Abstract. We consider two-player games played over finite state spaces for an infinite
number of rounds. At each state, the players simultaneously choose moves; the moves
determine a successor state. It is often advantageous for players to choose probability
distributions over moves, rather than single moves. Given a goal (e.g., "reach a target state"),
the question of winning is thus a probabilistic one: "what is the maximal probability of
winning from a given state?".

On these game structures, two fundamental notions are those of equivalences and metrics.
Given a set of winning conditions, two states are equivalent if the players can win
the same games with the same probability from both states. Metrics provide a bound on
the difference in the probabilities of winning across states, capturing a quantitative notion
of state "similarity".

We introduce equivalences and metrics for two-player game structures, and we show
that they characterize the difference in probability of winning games whose goals are
expressed in the quantitative $\mu$-calculus. The quantitative $\mu$-calculus can express a large
set of goals, including reachability, safety, and $\omega$-regular properties. Thus, we claim that
our relations and metrics provide the canonical extensions to games of the classical notion
of bisimulation for transition systems. We develop our results both for equivalences and
metrics, which generalize bisimulation, and for asymmetrical versions, which generalize
simulation.

1. Introduction

We consider two-player games played for an infinite number of rounds over finite state 
spaces. At each round, the players simultaneously and independently select moves; the 
moves then determine a probability distribution over successor states. These games, known 
variously as stochastic games [27] or concurrent games [7], generalize many common
structures in computer science, from transition systems, to Markov chains [15] and Markov 
decision processes [8]. The games are turn-based if, at each state, at most one of the players 
has a choice of moves, and deterministic if the successor state is uniquely determined by 
the current state, and by the moves chosen by the players. 

It is well-known that in such games with simultaneous moves it is often advantageous for 
the players to randomize their moves, so that at each round, they play not a single "pure" 
move, but rather, a probability distribution over the available moves. These probability 
distributions over moves, called mixed moves [23], lead to various notions of equilibria
[32], [23], such as the equilibrium result expressed by the minimax theorem [32]. Intuitively,
the benefit of playing mixed, rather than pure, moves lies in preventing the adversary from 
tailoring a response to the individual move played. Even for simple reachability games, the 
use of mixed moves may allow players to win, with probability 1, games that they would 
lose (i.e., win with probability 0) if restricted to playing pure moves [5]. With mixed moves, 
the question of winning a game with respect to a goal is thus a probabilistic one: what is 
the maximal probability a player can be guaranteed of winning, regardless of how the other 
player plays? This probability is known, in brief, as the winning probability. 

In structures ranging from transition systems to Markov decision processes and games, 
a fundamental question is the one of equivalence of states. Given a suitably large class $\Phi$
of properties, containing all properties of interest to the modeler, two states are equivalent
if the same properties hold in both states. For a property $\varphi$, denote the value of $\varphi$ at $s$ by
$\varphi(s)$: in the case of games, this might represent the maximal probability of a player winning
with respect to a goal expressed by $\varphi$. Two states $s$ and $t$ are equivalent if $\varphi(s) = \varphi(t)$
for all $\varphi \in \Phi$. For (finite-branching) transition systems, and for the class of properties
$\Phi$ expressible in the $\mu$-calculus, state equivalence is captured by bisimulation [22]; for
Markov decision processes, it is captured by probabilistic bisimulation [25]. For quantitative
properties, a notion related to equivalence is that of a metric: a metric provides a tight
bound for how much the value of a property can differ at states of the system, and thus
provides a quantitative notion of similarity between states. Given a set $\Phi$ of properties, the
metric distance of two states $s$ and $t$ can be defined as $\sup_{\varphi \in \Phi} |\varphi(s) - \varphi(t)|$. Metrics for
Markov decision processes have been studied in [9], [30], among others. Obviously, the metrics
and relations are connected, in the sense that the relations are the kernels of the metrics (the 
pairs of states having metric distance 0). The metrics and relations are at the heart of many 
verification techniques, from approximate reasoning (one can substitute states that are close 
in the metric) to system reductions (one can collapse equivalent states) to compositional 
reasoning and refinement (providing a notion of substitutivity of equivalents). 

We introduce metrics and equivalence relations for concurrent games, with respect to 
the class of properties $\Phi$ expressible in the quantitative $\mu$-calculus [3], [21]. We claim that
these metrics and relations represent the canonical extension of bisimulation to games. We 
also introduce asymmetrical versions of these metrics and equivalences, which constitute 
the canonical extension of simulation. 

An equivalence relation for deterministic games that are either turn-based, or where
the players are constrained to playing pure moves, has been introduced in [2] and called
alternating bisimulation. Relations and metrics for the general case of concurrent games
have so far proved elusive, with some previous attempts at their definition by a subset of
the authors following a subtly flawed approach [6]. The cause of the difficulty goes to
the heart of the definition of bisimulation. In the definition of bisimulation for transition 
systems, for every pair s, t of bisimilar states, we require that if s can go to a state s', 
then t should be able to go to t', such that s' and t' are again bisimilar (we also ask that 
s, t have an equivalent predicate valuation). This definition has been extended to Markov 
decision processes by requiring that for every mixed move from s, there is a mixed move 
from t, such that the moves induce probability distributions over successor states that are 
equivalent modulo the underlying bisimulation [25], [23]. Unfortunately, the generalization
of this appealing definition to games fails. It turns out, as we prove in this paper, that 
requiring players to be able to replicate probability distributions over successors (modulo 
the underlying equivalence) leads to an equivalence that is too fine, and that may fail 
to relate states at which the same quantitative $\mu$-calculus formulas hold. We show that
phrasing the definition in terms of distributions over successor states is the wrong approach 
for games; rather, the definition should be phrased in terms of expectations of certain 
metric-bounded quantities. 

Our starting point is a closer look at the definition of metrics for Markov decision 
processes. We observe that we can manipulate the definition of metrics given in [31],
obtaining an alternative form, which we call the a priori form, in contrast with the original
form of [31], which we call the a posteriori form. Informally, the a posteriori form is the
traditional definition, phrased in terms of similarity of probability distributions; the a priori 
form is instead phrased in terms of expectations. We show that, while on Markov decision 
processes these two forms coincide, this is not the case for games; moreover, we show that 
it is the a priori form that provides the canonical metrics for games. 

We prove that the a priori metric distance between two states $s$ and $t$ of a concurrent
game is equal to $\sup_{\varphi \in \Phi} |\varphi(s) - \varphi(t)|$, where $\Phi$ is the set of properties expressible via the
quantitative $\mu$-calculus. This result can be summarized by saying that the quantitative
$\mu$-calculus provides a logical characterization for the a priori metrics, similar to the way the
ordinary $\mu$-calculus provides a logical characterization of bisimulation. Furthermore, we
prove that a priori metrics (and their kernels, the a priori relations) satisfy a reciprocity
property, stating that properties expressed in terms of player 1 and player 2 winning
conditions have the same distinguishing power. This property is intimately connected to the
fact that concurrent games, played with mixed moves, are determined for $\omega$-regular goals
[20], [7]: the probability that player 1 achieves a goal $\varphi$ is one minus the probability that
player 2 achieves the goal $\neg\varphi$. Reciprocity ensures that there is one, canonical, notion of
game equivalence. This is in contrast to the case of alternating bisimulation of [2], in which
there are distinct player 1 and player 2 versions, as a consequence of the fact that concurrent
games, when played with pure moves, are not determined. The logical characterization and
reciprocity result justify our claim that a priori metrics and relations are the canonical
notion of metrics, and equivalence, for concurrent games. Neither the logical characterization
nor the reciprocity result holds for the a posteriori metrics and relations.

While this introduction focused mostly on metrics and equivalence relations, we also 
develop results for the asymmetrical versions of these notions, related to simulation. 

2. Games and Goals

We will develop metrics for game structures over a set S of states. We start with some 
preliminary definitions. For a finite set $A$, let $\mathrm{Dist}(A) = \{p : A \to [0,1] \mid \sum_{a \in A} p(a) = 1\}$
denote the set of probability distributions over $A$. We say that $p \in \mathrm{Dist}(A)$ is deterministic
if there is $a \in A$ such that $p(a) = 1$.

For a set $S$, a valuation over $S$ is a function $f : S \to [0,1]$ associating with every
element $s \in S$ a value $0 \le f(s) \le 1$; we let $\mathcal{F}$ be the set of all valuations. For $c \in [0,1]$, we
denote by $\mathbf{c}$ the constant valuation such that $\mathbf{c}(s) = c$ at all $s \in S$. We order valuations
pointwise: for $f, g \in \mathcal{F}$, we write $f \le g$ iff $f(s) \le g(s)$ at all $s \in S$; we remark that $\mathcal{F}$,
under $\le$, forms a complete lattice.

Given $a, b \in \mathbb{R}$, we write $a \sqcup b = \max\{a, b\}$, and $a \sqcap b = \min\{a, b\}$; we also let
$a \oplus b = \min\{1, \max\{0, a + b\}\}$ and $a \ominus b = \max\{0, \min\{1, a - b\}\}$. We extend $\sqcap$, $\sqcup$, $+$, $-$, $\oplus$,
$\ominus$ to valuations by interpreting them in pointwise fashion.
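
The clamped operators and their pointwise lifting can be sketched in a few lines of Python. This is only an illustration: the names `oplus`, `ominus`, and `lift`, and the choice of a dict from states to values as the representation of a valuation, are assumptions of the sketch and not notation from the text.

```python
def oplus(a: float, b: float) -> float:
    """Clamped sum: min{1, max{0, a + b}}."""
    return min(1.0, max(0.0, a + b))


def ominus(a: float, b: float) -> float:
    """Clamped difference: max{0, min{1, a - b}}."""
    return max(0.0, min(1.0, a - b))


def lift(op, f: dict, g: dict) -> dict:
    """Apply a binary operator pointwise to two valuations over the same states."""
    return {s: op(f[s], g[s]) for s in f}


# Hypothetical valuations over two states "s" and "t".
f = {"s": 0.75, "t": 0.25}
g = {"s": 0.5, "t": 0.5}
print(lift(oplus, f, g))   # {'s': 1.0, 't': 0.75}
print(lift(ominus, f, g))  # {'s': 0.25, 't': 0.0}
print(lift(max, f, g))     # pointwise join: {'s': 0.75, 't': 0.5}
```

The clamping keeps the result of every operation inside $[0,1]$, so the lifted operators map valuations to valuations.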

A directed metric is a function $d : S^2 \to \mathbb{R}_{\ge 0}$ which satisfies $d(s,s) = 0$ and
$d(s,t) \le d(s,u) + d(u,t)$ for all $s,t,u \in S$. We denote by $\mathcal{M} \subseteq S^2 \to \mathbb{R}$ the space of all metrics; this
space, ordered pointwise, forms a lattice which we indicate with $(\mathcal{M}, \le)$. Given a metric
$d \in \mathcal{M}$, we denote by $\breve{d}$ its opposite version, defined by $\breve{d}(s,t) = d(t,s)$ for all $s,t \in S$; we
say that $d$ is symmetrical if $d = \breve{d}$.
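
As a small illustration of these definitions, the following Python sketch checks the two directed-metric axioms on a finite state set and builds the opposite metric. The function names and the dict-of-pairs representation are assumptions made for the example only.

```python
from itertools import product


def is_directed_metric(d: dict, states: list) -> bool:
    """Check non-negativity, d(s, s) = 0, and the triangle inequality."""
    if any(d[(s, t)] < 0 for s, t in product(states, repeat=2)):
        return False
    if any(d[(s, s)] != 0 for s in states):
        return False
    return all(d[(s, t)] <= d[(s, u)] + d[(u, t)]
               for s, t, u in product(states, repeat=3))


def opposite(d: dict) -> dict:
    """The opposite metric: swap the two arguments."""
    return {(s, t): d[(t, s)] for (s, t) in d}


def is_symmetrical(d: dict) -> bool:
    return d == opposite(d)


# Hypothetical asymmetric example: "s" is at distance 0.5 from "t", but not vice versa.
states = ["s", "t"]
d = {("s", "s"): 0.0, ("s", "t"): 0.5, ("t", "s"): 0.0, ("t", "t"): 0.0}
print(is_directed_metric(d, states))  # True
print(is_symmetrical(d))              # False
```

A directed metric need not be symmetrical, which is what allows it to generalize simulation rather than bisimulation.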

2.1. Game Structures. We assume a fixed, finite set $\mathcal{V}$ of observation variables. A (two-player,
concurrent) game structure $G = \langle S, [\cdot], \mathit{Moves}, \Gamma_1, \Gamma_2, \delta \rangle$ consists of the following
components [5]:

• A finite set S of states. 

• A variable interpretation $[\cdot] : \mathcal{V} \times S \to [0,1]$, which associates with each variable $v \in \mathcal{V}$ a
valuation $[v]$.

• A finite set $\mathit{Moves}$ of moves.

• Two move assignments $\Gamma_1, \Gamma_2 : S \to 2^{\mathit{Moves}} \setminus \emptyset$. For $i \in \{1,2\}$, the assignment $\Gamma_i$ associates
with each state $s \in S$ the nonempty set $\Gamma_i(s) \subseteq \mathit{Moves}$ of moves available to player $i$ at
state $s$.

• A probabilistic transition function $\delta : S \times \mathit{Moves} \times \mathit{Moves} \to \mathrm{Dist}(S)$, that gives the
probability $\delta(s, a_1, a_2)(t)$ of a transition from $s$ to $t$ when player 1 plays move $a_1$ and
player 2 plays move $a_2$.
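
The components listed above can be sketched as a small Python data structure. The class name `GameStructure`, its field names, and the three-state example are hypothetical and serve only to make the definition concrete.

```python
from dataclasses import dataclass


@dataclass
class GameStructure:
    states: list   # S
    interp: dict   # [.]: variable -> {state: value in [0, 1]}
    moves1: dict   # Gamma_1: state -> nonempty list of moves
    moves2: dict   # Gamma_2: state -> nonempty list of moves
    delta: dict    # (s, a1, a2) -> {successor: probability}

    def validate(self) -> None:
        """Check nonempty move sets and that each delta(s, a1, a2) is a distribution."""
        for s in self.states:
            assert self.moves1[s] and self.moves2[s], "move sets must be nonempty"
            for a1 in self.moves1[s]:
                for a2 in self.moves2[s]:
                    dist = self.delta[(s, a1, a2)]
                    assert set(dist) <= set(self.states)
                    assert abs(sum(dist.values()) - 1.0) < 1e-9

    def dest(self, s, a1, a2) -> set:
        """Successors reached with positive probability."""
        return {t for t, p in self.delta[(s, a1, a2)].items() if p > 0}


# Hypothetical example: at "s" both players pick "a" or "b";
# matching moves lead to "w", mismatched moves lead to "l".
delta = {("s", a1, a2): {("w" if a1 == a2 else "l"): 1.0}
         for a1 in "ab" for a2 in "ab"}
delta[("w", "a", "a")] = {"w": 1.0}
delta[("l", "a", "a")] = {"l": 1.0}
game = GameStructure(
    states=["s", "w", "l"],
    interp={"goal": {"s": 0.0, "w": 1.0, "l": 0.0}},
    moves1={"s": ["a", "b"], "w": ["a"], "l": ["a"]},
    moves2={"s": ["a", "b"], "w": ["a"], "l": ["a"]},
    delta=delta,
)
game.validate()
print(game.dest("s", "a", "b"))  # {'l'}
```

Keying the transition function by the triple (state, move of player 1, move of player 2) mirrors the signature of the transition function directly.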

At every state $s \in S$, player 1 chooses a move $a_1 \in \Gamma_1(s)$, and simultaneously and
independently player 2 chooses a move $a_2 \in \Gamma_2(s)$. The game then proceeds to the successor
state $t \in S$ with probability $\delta(s, a_1, a_2)(t)$. We denote by
$\mathrm{Dest}(s, a_1, a_2) = \{t \in S \mid \delta(s, a_1, a_2)(t) > 0\}$ the set of destination states when actions $a_1, a_2$ are chosen at $s$. The
variables in $\mathcal{V}$ naturally induce an equivalence on states: for states $s, t$, define $s \equiv t$ if for all
$v \in \mathcal{V}$ we have $[v](s) = [v](t)$. In the following, unless otherwise noted, the definitions refer
to a game structure with components $G = \langle S, [\cdot], \mathit{Moves}, \Gamma_1, \Gamma_2, \delta \rangle$. For player $i \in \{1,2\}$,
we write ${\sim}i = 3 - i$ for the opponent. We also consider the following subclasses of game
structures.

• Turn-based game structures. A game structure $G$ is turn-based if we can write $S$ as the
disjoint union of two sets: the set $S_1$ of player 1 states, and the set $S_2$ of player 2 states,
such that $s \in S_1$ implies $|\Gamma_2(s)| = 1$, and $s \in S_2$ implies $|\Gamma_1(s)| = 1$, and further, there
is a special variable $\mathit{turn} \in \mathcal{V}$, such that $[\mathit{turn}](s) = 1$ iff $s \in S_1$, and $[\mathit{turn}](s) = 0$ iff
$s \in S_2$; thus, the variable $\mathit{turn}$ indicates whose turn it is to play at a state.

• Markov decision processes. A game structure $G$ is a Markov decision process (MDP) [8] if
only one of the two players has a choice of moves. For $i \in \{1,2\}$, we say that a structure
is an $i$-MDP if $\forall s \in S$, $|\Gamma_{{\sim}i}(s)| = 1$. For MDPs, we omit the (single) move of the player
without a choice of moves, and write $\delta(s, a)$ for the transition function.

• Deterministic game structures. A game structure $G$ is deterministic if, for all $s \in S$,
$a_1 \in \mathit{Moves}$, and $a_2 \in \mathit{Moves}$, there exists a $t \in S$ such that $\delta(s, a_1, a_2)(t) = 1$; we
denote such $t$ by $\tau(s, a_1, a_2)$. We sometimes call probabilistic a general game structure,
to emphasize the fact that it is not necessarily deterministic.

Note that MDPs can be seen as turn-based games by setting $[\mathit{turn}] = \mathbf{1}$ for 1-MDPs and
$[\mathit{turn}] = \mathbf{0}$ for 2-MDPs.

Pure and mixed moves. A mixed move is a probability distribution over the moves 
available to a player at a state. We denote by $\mathcal{D}_i(s) = \mathrm{Dist}(\Gamma_i(s))$ the set of mixed moves
available to player $i \in \{1,2\}$ at $s \in S$. The moves in $\mathit{Moves}$ are called pure moves, in
contrast to mixed moves. We extend the transition function to mixed moves. For $s \in S$
and $x_1 \in \mathcal{D}_1(s)$, $x_2 \in \mathcal{D}_2(s)$, we write $\delta(s, x_1, x_2)$ for the next-state probability distribution
induced by the mixed moves $x_1$ and $x_2$, defined for all $t \in S$ by

$$\delta(s, x_1, x_2)(t) = \sum_{a_1 \in \Gamma_1(s)} \sum_{a_2 \in \Gamma_2(s)} \delta(s, a_1, a_2)(t) \, x_1(a_1) \, x_2(a_2) .$$

In the following, we sometimes restrict the moves of the players to pure moves. We identify
a pure move $a \in \Gamma_i(s)$ available to player $i \in \{1,2\}$ at a state $s$ with a deterministic
distribution that plays $a$ with probability 1.
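
The extension of the transition function to mixed moves is a double sum over pure moves, which the following Python sketch spells out. The names `mixed_delta` and `pure`, the dict representation of mixed moves, and the example game are assumptions of the sketch.

```python
def mixed_delta(delta: dict, s, x1: dict, x2: dict) -> dict:
    """Next-state distribution induced at s by mixed moves x1, x2 (dicts move -> probability)."""
    out = {}
    for a1, p1 in x1.items():
        for a2, p2 in x2.items():
            for t, p in delta[(s, a1, a2)].items():
                out[t] = out.get(t, 0.0) + p * p1 * p2
    return out


def pure(a) -> dict:
    """A pure move seen as the deterministic distribution that plays it with probability 1."""
    return {a: 1.0}


# Hypothetical one-shot game: matching moves lead to "w", mismatched moves lead to "l".
delta = {("s", a1, a2): {("w" if a1 == a2 else "l"): 1.0}
         for a1 in "ab" for a2 in "ab"}
uniform = {"a": 0.5, "b": 0.5}
print(mixed_delta(delta, "s", uniform, pure("a")))  # {'w': 0.5, 'l': 0.5}
print(mixed_delta(delta, "s", pure("a"), pure("a")))  # {'w': 1.0}
```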

The deterministic setting. The deterministic setting is obtained by considering
deterministic game structures, with players restricted to playing pure moves.



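2.2. Predecessor operators. Given a valuation $f \in \mathcal{F}$, a state $s \in S$, and two mixed
moves $x_1 \in \mathcal{D}_1(s)$ and $x_2 \in \mathcal{D}_2(s)$, we define the expectation of $f$ from $s$ under $x_1, x_2$:

$$E_s^{x_1,x_2}(f) = \sum_{t \in S} \delta(s, x_1, x_2)(t) \, f(t) .$$

For a game structure $G$, for $i \in \{1,2\}$ we define the valuation transformer $\mathrm{Pre}_i : \mathcal{F} \to \mathcal{F}$
by, for all $f \in \mathcal{F}$ and $s \in S$,

$$\mathrm{Pre}_i(f)(s) = \sup_{x_i \in \mathcal{D}_i(s)} \inf_{x_{{\sim}i} \in \mathcal{D}_{{\sim}i}(s)} E_s^{x_1,x_2}(f) .$$

Intuitively, $\mathrm{Pre}_i(f)(s)$ is the maximal expectation player $i$ can achieve of $f$ after one step
from $s$: this is the classical "one-day" or "next-stage" operator of the theory of repeated
games [12]. We also define a deterministic version of this operator, in which players are
forced to play pure moves:

$$\mathrm{Pre}_i^{\Gamma}(f)(s) = \max_{x_i \in \Gamma_i(s)} \min_{x_{{\sim}i} \in \Gamma_{{\sim}i}(s)} E_s^{x_1,x_2}(f) .$$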
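
To see the gap between the two operators, the following Python sketch evaluates the pure-move operator exactly and, for one fixed mixed move of player 1, the expectation that this move guarantees against every reply. It does not compute the mixed operator itself, which would require solving a matrix game; the function names and the example game are assumptions of the sketch.

```python
def expectation(delta: dict, s, x1: dict, x2: dict, f: dict) -> float:
    """Expectation of f after one step from s under mixed moves x1, x2."""
    return sum(p1 * p2 * p * f[t]
               for a1, p1 in x1.items()
               for a2, p2 in x2.items()
               for t, p in delta[(s, a1, a2)].items())


def pre_pure(delta, moves1, moves2, s, f) -> float:
    """Pure-move predecessor for player 1: max over a1 of min over a2."""
    return max(min(expectation(delta, s, {a1: 1.0}, {a2: 1.0}, f)
                   for a2 in moves2[s])
               for a1 in moves1[s])


def guarantee(delta, moves2, s, x1, f) -> float:
    """Worst-case expectation of a fixed mixed move x1 of player 1.

    The expectation is linear in player 2's mixed move, so the infimum
    over mixed replies is attained at a pure move.
    """
    return min(expectation(delta, s, x1, {a2: 1.0}, f) for a2 in moves2[s])


# Hypothetical game: player 1 wins (reaches "w") iff the two moves match.
delta = {("s", a1, a2): {("w" if a1 == a2 else "l"): 1.0}
         for a1 in "ab" for a2 in "ab"}
moves1 = moves2 = {"s": ["a", "b"]}
f = {"w": 1.0, "l": 0.0}
print(pre_pure(delta, moves1, moves2, "s", f))                 # 0.0
print(guarantee(delta, moves2, "s", {"a": 0.5, "b": 0.5}, f))  # 0.5
```

In this example every pure move of player 1 can be countered, so the pure-move value is 0, while the uniform mixed move already guarantees 0.5; the mixed-move operator is therefore at least 0.5 at this state.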

2.3. Quantitative $\mu$-calculus. We consider the set of properties expressed by the quantitative
$\mu$-calculus ($q\mu$). As discussed in prior work, a large set of properties can be encoded
in $q\mu$, spanning from basic properties such as maximal reachability and safety probability,
to the maximal probability of satisfying a general $\omega$-regular specification.

Syntax. The syntax of quantitative $\mu$-calculus is defined with respect to the set of observation
variables $\mathcal{V}$ as well as a set $\mathit{MVars}$ of calculus variables, which are distinct from the
observation variables in $\mathcal{V}$. The syntax is given as follows:

$$\varphi ::= c \mid v \mid Z \mid \neg\varphi \mid \varphi \vee \varphi \mid \varphi \wedge \varphi \mid \varphi \oplus c \mid \varphi \ominus c \mid \mathrm{pre}_1(\varphi) \mid \mathrm{pre}_2(\varphi) \mid \mu Z.\, \varphi \mid \nu Z.\, \varphi$$

for constants $c \in [0,1]$, observation variables $v \in \mathcal{V}$, and calculus variables $Z \in \mathit{MVars}$.
In the formulas $\mu Z.\, \varphi$ and $\nu Z.\, \varphi$, we furthermore require that all occurrences of the bound
variable $Z$ in $\varphi$ occur in the scope of an even number of occurrences of the complement
operator $\neg$. A formula $\varphi$ is closed if every calculus variable $Z$ in $\varphi$ occurs in the scope of
a quantifier $\mu Z$ or $\nu Z$. From now on, with abuse of notation, we denote by $q\mu$ the set of
closed formulas of $q\mu$. A formula $\varphi$ is a player $i$ formula, for $i \in \{1,2\}$, if $\varphi$ does not contain
the $\mathrm{pre}_{{\sim}i}$ operator; we denote with $q\mu_i$ the syntactic subset of $q\mu$ consisting only of closed
player $i$ formulas. A formula is in positive form if the negation appears only in front of
observation variables, i.e., in the context $\neg v$; we denote with $q\mu^+$ and $q\mu_i^+$ the subsets of
$q\mu$ and $q\mu_i$ consisting only of positive formulas.

We remark that the fixpoint operators $\mu$ and $\nu$ will not be needed to achieve our
results on the logical characterization of game relations. They have been included in the
calculus because they allow the expression of many interesting properties, such as safety,
reachability, and in general, $\omega$-regular properties. The operators $\oplus$ and $\ominus$, on the other
hand, are necessary for our results.

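Semantics. A variable valuation $\xi : \mathit{MVars} \to \mathcal{F}$ is a function that maps every variable
$Z \in \mathit{MVars}$ to a valuation in $\mathcal{F}$. We write $\xi[Z \mapsto f]$ for the valuation that agrees with $\xi$
on all variables, except that $Z$ is mapped to $f$. Given a game structure $G$ and a variable
valuation $\xi$, every formula $\varphi$ of the quantitative $\mu$-calculus defines a valuation $[\![\varphi]\!]^G_\xi \in \mathcal{F}$
(the superscript $G$ is omitted if the game structure is clear from the context):

$$\begin{aligned}
[\![c]\!]_\xi &= \mathbf{c} \\
[\![v]\!]_\xi &= [v] \\
[\![Z]\!]_\xi &= \xi(Z) \\
[\![\neg\varphi]\!]_\xi &= \mathbf{1} - [\![\varphi]\!]_\xi \\
[\![\varphi \,\{\oplus, \ominus\}\, c]\!]_\xi &= [\![\varphi]\!]_\xi \,\{\oplus, \ominus\}\, \mathbf{c} \\
[\![\varphi_1 \,\{\vee, \wedge\}\, \varphi_2]\!]_\xi &= [\![\varphi_1]\!]_\xi \,\{\sqcup, \sqcap\}\, [\![\varphi_2]\!]_\xi \\
[\![\mathrm{pre}_i(\varphi)]\!]_\xi &= \mathrm{Pre}_i([\![\varphi]\!]_\xi) \\
[\![\{\mu, \nu\} Z.\, \varphi]\!]_\xi &= \{\inf, \sup\} \, \{ f \in \mathcal{F} \mid f = [\![\varphi]\!]_{\xi[Z \mapsto f]} \}
\end{aligned}$$

where $i \in \{1,2\}$. The fixpoints exist by the monotonicity and continuity of all operators,
and they can be computed by Picard iteration [7]. If $\varphi$ is closed, $[\![\varphi]\!]_\xi$
is independent of $\xi$, and we write simply $[\![\varphi]\!]$.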

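We also define a deterministic semantics $[\![\cdot]\!]^{\Gamma}$ for $q\mu$, in which players can select only
pure moves in the operators $\mathrm{pre}_1$, $\mathrm{pre}_2$. $[\![\cdot]\!]^{\Gamma}$ is defined as $[\![\cdot]\!]$, except for the clause

$$[\![\mathrm{pre}_i(\varphi)]\!]^{\Gamma}_\xi = \mathrm{Pre}_i^{\Gamma}([\![\varphi]\!]^{\Gamma}_\xi) .$$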

Example 1. Given a set $T \subseteq S$, the characteristic valuation $\mathbf{T}$ of $T$ is defined by $\mathbf{T}(s) = 1$
if $s \in T$, and $\mathbf{T}(s) = 0$ otherwise. With this notation, the maximal probability with which
player $i \in \{1,2\}$ can ensure eventually reaching $T \subseteq S$ is given by $[\![\mu Z.(\mathbf{T} \vee \mathrm{pre}_i(Z))]\!]$, and
the maximal probability with which player $i$ can guarantee staying in $T$ forever is given by
$[\![\nu Z.(\mathbf{T} \wedge \mathrm{pre}_i(Z))]\!]$ (see, e.g., [7]). The first property is called a reachability property, the
second a safety property.
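
The reachability formula of Example 1 can be evaluated by Picard iteration, starting from the constant valuation 0 and repeatedly applying the body of the fixpoint. The Python sketch below does this under the deterministic (pure-move) semantics only, which avoids solving matrix games; the function names, the stopping tolerance, and the four-state example are assumptions of the sketch.

```python
def pre_pure(delta, moves1, moves2, f, s) -> float:
    """Pure-move predecessor for player 1 at s: max over a1 of min over a2."""
    return max(
        min(sum(p * f[t] for t, p in delta[(s, a1, a2)].items())
            for a2 in moves2[s])
        for a1 in moves1[s]
    )


def reach(states, moves1, moves2, delta, target, tol=1e-12, max_iter=10_000) -> dict:
    """Picard iteration for mu Z.(T or pre_1(Z)) under pure moves."""
    z = {s: 0.0 for s in states}
    for _ in range(max_iter):
        nxt = {s: max(1.0 if s in target else 0.0,
                      pre_pure(delta, moves1, moves2, z, s))
               for s in states}
        if max(abs(nxt[s] - z[s]) for s in states) <= tol:
            return nxt
        z = nxt
    return z  # best approximation found within max_iter


# Hypothetical game: player 1 moves at s0, player 2 moves at s1.
states = ["s0", "s1", "goal", "sink"]
moves1 = {"s0": ["safe", "risky"], "s1": ["-"], "goal": ["-"], "sink": ["-"]}
moves2 = {"s0": ["-"], "s1": ["x", "y"], "goal": ["-"], "sink": ["-"]}
delta = {
    ("s0", "safe", "-"): {"s1": 1.0},
    ("s0", "risky", "-"): {"goal": 0.5, "sink": 0.5},
    ("s1", "-", "x"): {"goal": 0.75, "sink": 0.25},
    ("s1", "-", "y"): {"goal": 0.25, "sink": 0.75},
    ("goal", "-", "-"): {"goal": 1.0},
    ("sink", "-", "-"): {"sink": 1.0},
}
print(reach(states, moves1, moves2, delta, {"goal"}))
# {'s0': 0.5, 's1': 0.25, 'goal': 1.0, 'sink': 0.0}
```

At s1 the adversary picks the move that minimizes the chance of reaching the goal, so the "safe" move at s0 is worth only 0.25 and player 1 prefers "risky". In general the iteration may converge only in the limit, which is why the sketch stops at a tolerance.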

3. Metrics 

We are interested in developing a metric on states of a game structure that captures an 
approximate notion of equivalence: states close in the metric should yield similar values to 
the players for any winning objective. Specifically, we are interested in defining a bisimulation
metric $[\simeq_g] \in \mathcal{M}$ such that for any game structure $G$ and states $s, t$ of $G$, the following
continuity property holds:

$$[\simeq_g](s,t) = \sup_{\varphi \in q\mu} \big| [\![\varphi]\!](s) - [\![\varphi]\!](t) \big| . \tag{3.1}$$

In particular, the kernel of the metric, that is, states at distance 0, are equivalent: each
player can get exactly the same value from either state for any objective. Notice that in
defining the metric independent of a player, we are expecting our metrics to be reciprocal,
that is, invariant under a change of player. Reciprocity is expected to hold since the
underlying games we consider are determined (for any game, the value obtained by player
2 is one minus the value obtained by player 1), and it yields canonical metrics on games.

Thus, our metrics will generalize equivalence and refinement relations that have been 
studied on MDPs and in the deterministic setting. To underline the connection between 
classical equivalences and the metrics we develop, we write $[s \simeq_g t]$ for $[\simeq_g](s,t)$, so that
the desired property of the bisimulation metric can be stated as

$$[s \simeq_g t] = \sup_{\varphi \in q\mu} \big| [\![\varphi]\!](s) - [\![\varphi]\!](t) \big| .$$

Metrics of this type have already been developed for Markov decision processes (MDPs) 
[30], [10]. Our construction of metrics for games starts from an analysis of these constructions.

3.1. Metrics for MDPs. We consider the case of 1-MDPs; the case for 2-MDPs is symmetrical.
Throughout this subsection, we fix a 1-MDP $\langle S, [\cdot], \mathit{Moves}, \Gamma_1, \Gamma_2, \delta \rangle$. Before we
present the metric correspondent of probabilistic simulation, we first rephrase classical
probabilistic (bi)simulation on MDPs [18], [25], [26] as a fixpoint of a relation transformer. As
a first step, we lift relations between states to relations between distributions. Given a
relation $R \subseteq S \times S$ and two distributions $p, q \in \mathrm{Dist}(S)$, we let $p \sqsubseteq_R q$ if there is a function
$\Delta : S \times S \to [0,1]$ such that:

• $\Delta(s, s') > 0$ implies $(s, s') \in R$;

• $p(s) = \sum_{s' \in S} \Delta(s, s')$ for any $s \in S$;

• $q(s') = \sum_{s \in S} \Delta(s, s')$ for any $s' \in S$.
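
The three conditions above are easy to check for a given candidate witness. The following Python sketch verifies one candidate function; it does not search for a witness, so a negative answer says nothing about whether another witness exists. The name `is_witness` and the example distributions are assumptions of the sketch.

```python
def is_witness(Delta: dict, p: dict, q: dict, R: set, states: list, tol=1e-9) -> bool:
    """Check that Delta certifies the lifting of R from p to q.

    Delta maps pairs (s, s') to weights; missing pairs count as 0.
    """
    # Positive weight only on pairs related by R.
    support_ok = all((s, t) in R for (s, t), w in Delta.items() if w > 0)
    # Row sums reproduce p.
    rows_ok = all(
        abs(sum(Delta.get((s, t), 0.0) for t in states) - p.get(s, 0.0)) <= tol
        for s in states)
    # Column sums reproduce q.
    cols_ok = all(
        abs(sum(Delta.get((s, t), 0.0) for s in states) - q.get(t, 0.0)) <= tol
        for t in states)
    return support_ok and rows_ok and cols_ok


# Hypothetical example: all of p's mass sits on "a", and q splits it over "b" and "c".
states = ["a", "b", "c"]
p = {"a": 1.0}
q = {"b": 0.5, "c": 0.5}
Delta = {("a", "b"): 0.5, ("a", "c"): 0.5}
print(is_witness(Delta, p, q, {("a", "b"), ("a", "c")}, states))  # True
print(is_witness(Delta, p, q, {("a", "b")}, states))              # False
```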

To rephrase probabilistic simulation, we define the relation transformer $F : 2^{S \times S} \to 2^{S \times S}$
as follows. For all relations $R \subseteq S \times S$ and $s, t \in S$, we let $(s,t) \in F(R)$ iff

$$s \equiv t \;\wedge\; \forall x_1 \in \mathcal{D}_1(s) \,.\, \exists y_1 \in \mathcal{D}_1(t) \,.\, \delta(s, x_1) \sqsubseteq_R \delta(t, y_1) . \tag{3.2}$$

Probabilistic simulation is the greatest fixpoint of (3.2); probabilistic
bisimulation is the greatest symmetrical fixpoint of (3.2).

To obtain a metric equivalent of probabilistic simulation, we lift the above fixpoint 
from relations (subsets of $S^2$) to metrics (maps $S^2 \to \mathbb{R}$). First, we define $[\equiv] \in \mathcal{M}$ for
all $s, t \in S$ by $[s \equiv t] = \max_{v \in \mathcal{V}} |[v](s) - [v](t)|$. Second, we lift (3.2) to metrics, defining
a metric transformer $H^{MDP}_{post} : \mathcal{M} \to \mathcal{M}$. For all $d \in \mathcal{M}$, let $D(\delta(s, x_1), \delta(t, y_1))(d)$ be the
distribution distance between $\delta(s, x_1)$ and $\delta(t, y_1)$ with respect to the metric $d$. We will
show later how to define such a distribution distance. For $s, t \in S$, we let

$$H^{MDP}_{post}(d)(s,t) = [s \equiv t] \sqcup \sup_{x_1 \in \mathcal{D}_1(s)} \inf_{y_1 \in \mathcal{D}_1(t)} D(\delta(s, x_1), \delta(t, y_1))(d) . \tag{3.3}$$

In this definition, the $\forall$ and $\exists$ of (3.2) have been replaced by $\sup$ and $\inf$, respectively.
Since equivalent states should have distance 0, the simulation metric in MDPs is defined as
the least (rather than greatest) fixpoint of (3.3) [30], [10]. Similarly, the bisimulation metric
is defined as the least symmetrical fixpoint of (3.3).

For a distance $d \in \mathcal{M}$ and two distributions $p, q \in \mathrm{Dist}(S)$, the distribution distance
$D(p,q)(d)$ is a measure of how much "work" we have to do to make $p$ look like $q$, given
that moving a unit of probability mass from $s \in S$ to $t \in S$ has cost $d(s,t)$. More precisely,
$D(p,q)(d)$ is defined via the trans-shipping problem, as the minimum cost of shipping the
distribution $p$ into $q$, with edge costs $d$. Thus, $D(p,q)(d)$ is the solution of the following
linear programming (LP) problem over the set of variables $\{\lambda_{s,t}\}_{s,t \in S}$:

$$\text{Minimize } \sum_{s,t \in S} d(s,t) \, \lambda_{s,t} \quad \text{subject to } \sum_{t \in S} \lambda_{s,t} = p(s), \quad \sum_{s \in S} \lambda_{s,t} = q(t), \quad \lambda_{s,t} \ge 0 .$$

Equivalently, we can define $D(p,q)(d)$ via the dual of the above LP problem [30]. Given a
metric $d \in \mathcal{M}$, let $C(d) \subseteq \mathcal{F}$ be the subset of valuations $k \in \mathcal{F}$ such that $k(s) - k(t) \le d(s,t)$
for all $s, t \in S$. Then the dual formulation is:

$$\text{Maximize } \sum_{s \in S} \big(p(s) - q(s)\big) \, k(s) \quad \text{subject to } k \in C(d) . \tag{3.4}$$
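
The relationship between the two formulations can be illustrated without an LP solver. By weak duality, the objective of any feasible valuation in the dual is at most the cost of any feasible shipping plan in the primal, so a matching pair certifies the optimum. The Python sketch below only evaluates given candidates; the function names and the three-state example on a line are assumptions of the sketch.

```python
def is_coupling(lam: dict, p: dict, q: dict, states: list, tol=1e-9) -> bool:
    """Primal feasibility: non-negative plan with row sums p and column sums q."""
    if any(w < 0 for w in lam.values()):
        return False
    rows = all(abs(sum(lam.get((s, t), 0.0) for t in states) - p.get(s, 0.0)) <= tol
               for s in states)
    cols = all(abs(sum(lam.get((s, t), 0.0) for s in states) - q.get(t, 0.0)) <= tol
               for t in states)
    return rows and cols


def primal_cost(lam: dict, d: dict) -> float:
    """Cost of a shipping plan with edge costs d."""
    return sum(d[st] * w for st, w in lam.items())


def in_C(k: dict, d: dict, states: list, tol=1e-12) -> bool:
    """Dual feasibility: k is a valuation with k(s) - k(t) <= d(s, t) for all s, t."""
    return (all(0.0 <= k[s] <= 1.0 for s in states)
            and all(k[s] - k[t] <= d[(s, t)] + tol for s in states for t in states))


def dual_value(k: dict, p: dict, q: dict, states: list) -> float:
    """Dual objective: sum over s of (p(s) - q(s)) * k(s)."""
    return sum((p.get(s, 0.0) - q.get(s, 0.0)) * k[s] for s in states)


# Hypothetical example: three states on a line at positions 0, 0.5, 1.
states = ["s", "t", "u"]
pos = {"s": 0.0, "t": 0.5, "u": 1.0}
d = {(a, b): abs(pos[a] - pos[b]) for a in states for b in states}
p = {"s": 0.5, "t": 0.5}
q = {"t": 0.5, "u": 0.5}
lam = {("s", "t"): 0.5, ("t", "u"): 0.5}  # candidate shipping plan
k = {"s": 1.0, "t": 0.5, "u": 0.0}        # candidate dual valuation

assert is_coupling(lam, p, q, states) and in_C(k, d, states)
print(primal_cost(lam, d))          # 0.5
print(dual_value(k, p, q, states))  # 0.5
```

Since the two feasible candidates give the same value, both are optimal and the distribution distance in this example is 0.5.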

The constraint $C(d)$ on the valuation $k$ states that the value of $k$ across states cannot differ
by more than $d$. This means, intuitively, that $k$ behaves like the valuation of a $q\mu$ formula:
as we will see, the logical characterization implies that $d$ is a bound for the difference in
valuation of $q\mu$ formulas across states. Indeed, the logical characterization of the metrics
is proved by constructing formulas whose valuations approximate that of the optimal $k$.
Plugging (3.4) into (3.3), we obtain:

$$H^{MDP}_{post}(d)(s,t) = [s \equiv t] \sqcup \sup_{x_1 \in \mathcal{D}_1(s)} \inf_{y_1 \in \mathcal{D}_1(t)} \sup_{k \in C(d)} \big( E_s^{x_1}(k) - E_t^{y_1}(k) \big) . \tag{3.5}$$

We can interpret this definition as follows. State $t$ is trying to simulate state $s$ (this is a
definition of a simulation metric). First, state $s$ chooses a mixed move $x_1$, attempting to
make simulation as hard as possible; then, state $t$ chooses a mixed move $y_1$, trying to match
the effect of $x_1$. Once $x_1$ and $y_1$ have been chosen, the resulting distance between $s$ and
$t$ is equal to the maximal difference in expectation, for moves $x_1$ and $y_1$, of a valuation
$k \in C(d)$. We call the metric transformer $H^{MDP}_{post}$ the a posteriori metric transformer: the
valuation $k$ in (3.5) is chosen after the moves $x_1$ and $y_1$ are chosen. We can define an a
priori metric transformer, where $k$ is chosen before $x_1$ and $y_1$:

$$H^{MDP}_{prio}(d)(s,t) = [s \equiv t] \sqcup \sup_{k \in C(d)} \sup_{x_1 \in \mathcal{D}_1(s)} \inf_{y_1 \in \mathcal{D}_1(t)} \big( E_s^{x_1}(k) - E_t^{y_1}(k) \big) . \tag{3.6}$$

Intuitively, in the a priori transformer, first a valuation $k \in C(d)$ is chosen. Then, state
$t$ must simulate state $s$ with respect to the expectation of $k$. State $s$ chooses a move $x_1$,
trying to maximize the difference in expectations, and state $t$ chooses a move $y_1$, trying to
minimize it. The distance between $s$ and $t$ is then equal to the difference in the resulting
expectations of $k$.

Theorem 3.1 below states that for MDPs, a priori and a posteriori simulation metrics
coincide. In the next section, we will see that this is not the case for games.

Theorem 3.1. For all MDPs, $H^{MDP}_{prio} = H^{MDP}_{post}$.

Proof. Consider two states $s, t \in S$, and a metric $d \in \mathcal{M}$. We have to prove that

$$\sup_{k} \sup_{x_1} \inf_{y_1} \big[ E_s^{x_1}(k) - E_t^{y_1}(k) \big] = \sup_{x_1} \inf_{y_1} \sup_{k} \big[ E_s^{x_1}(k) - E_t^{y_1}(k) \big] , \tag{3.7}$$

where $k$ ranges over $C(d)$, $x_1$ over $\mathcal{D}_1(s)$, and $y_1$ over $\mathcal{D}_1(t)$.
In the left-hand side, we can exchange the two outer sups. Then, noticing that the
difference in expectation is bi-linear in $k$ and $y_1$ for a fixed $x_1$, that $y_1$ is a probability
distribution, and that $k$ is chosen from a compact convex subset, we apply the generalized
minimax theorem [28] to exchange $\sup_k \inf_{y_1}$ into $\inf_{y_1} \sup_k$, thus obtaining the right-hand
side. ■

The metrics defined above are logically characterized by $q\mu$. Precisely, let $[\simeq] \in \mathcal{M}$ be
the least symmetrical fixpoint of $H^{MDP}_{prio} = H^{MDP}_{post}$. Then, Lemma 5.24 and Corollary 5.25
of [10] (originally stated for $H^{MDP}_{post}$) state that for all states $s, t$ of a 1-MDP, we have

$$[s \simeq t] = \sup_{\varphi \in q\mu} \big| [\![\varphi]\!](s) - [\![\varphi]\!](t) \big| .$$



3.2. Metrics for Concurrent Games. We now extend the simulation and bisimulation 
metrics from MDPs to general game structures. As we shall see, unlike for MDPs, the a 
priori and the a posteriori metrics do not coincide over games. In particular, we show that
the a priori formulation satisfies both a tight logical characterization and reciprocity,
while, perhaps surprisingly, the more natural a posteriori version satisfies neither.

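A posteriori metrics are defined via the metric transformer $H_{\sqsubseteq_1} : \mathcal{M} \to \mathcal{M}$ as follows,
for all $d \in \mathcal{M}$ and $s, t \in S$:

$$\begin{aligned}
H_{\sqsubseteq_1}(d)(s,t)
&= [s \equiv t] \sqcup \sup_{x_1 \in \mathcal{D}_1(s)} \inf_{y_1 \in \mathcal{D}_1(t)} \sup_{y_2 \in \mathcal{D}_2(t)} \inf_{x_2 \in \mathcal{D}_2(s)} D(\delta(s, x_1, x_2), \delta(t, y_1, y_2))(d) \\
&= [s \equiv t] \sqcup \sup_{x_1 \in \mathcal{D}_1(s)} \inf_{y_1 \in \mathcal{D}_1(t)} \sup_{y_2 \in \mathcal{D}_2(t)} \inf_{x_2 \in \mathcal{D}_2(s)} \sup_{k \in C(d)} \big( E_s^{x_1,x_2}(k) - E_t^{y_1,y_2}(k) \big) .
\end{aligned} \tag{3.8}$$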
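
A priori metrics are defined by bringing the $\sup_k$ outside. Precisely, we define a metric
transformer $H_{\preceq_1} : \mathcal{M} \to \mathcal{M}$ as follows, for all $d \in \mathcal{M}$ and $s, t \in S$:

$$\begin{aligned}
H_{\preceq_1}(d)(s,t)
&= [s \equiv t] \sqcup \sup_{k \in C(d)} \sup_{x_1 \in \mathcal{D}_1(s)} \inf_{y_1 \in \mathcal{D}_1(t)} \sup_{y_2 \in \mathcal{D}_2(t)} \inf_{x_2 \in \mathcal{D}_2(s)} \big( E_s^{x_1,x_2}(k) - E_t^{y_1,y_2}(k) \big) \\
&= [s \equiv t] \sqcup \sup_{k \in C(d)} \Big[ \sup_{x_1 \in \mathcal{D}_1(s)} \inf_{x_2 \in \mathcal{D}_2(s)} E_s^{x_1,x_2}(k) - \sup_{y_1 \in \mathcal{D}_1(t)} \inf_{y_2 \in \mathcal{D}_2(t)} E_t^{y_1,y_2}(k) \Big] \\
&= [s \equiv t] \sqcup \sup_{k \in C(d)} \big( \mathrm{Pre}_1(k)(s) - \mathrm{Pre}_1(k)(t) \big) .
\end{aligned} \tag{3.9}$$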

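First, we show that $H_{\preceq_1}$ and $H_{\sqsubseteq_1}$ are monotonic in the lattice of metrics $(\mathcal{M}, \le)$.

Lemma 3.2. The functions $H_{\preceq_1}$ and $H_{\sqsubseteq_1}$ are monotonic in the lattice of metrics $(\mathcal{M}, \le)$.

Proof. For $d, d' \in \mathcal{M}$, $d \le d'$ implies $C(d) \subseteq C(d')$, and hence
$\sup_{k \in C(d)} (\mathrm{Pre}_1(k)(s) - \mathrm{Pre}_1(k)(t)) \le \sup_{k \in C(d')} (\mathrm{Pre}_1(k)(s) - \mathrm{Pre}_1(k)(t))$.
This shows the monotonicity of $H_{\preceq_1}$.

The monotonicity of $H_{\sqsubseteq_1}$ can be shown in a similar fashion. From $d \le d'$, reasoning as
before we obtain

$$\sup_{k \in C(d)} \big( E_s^{x_1,x_2}(k) - E_t^{y_1,y_2}(k) \big) \le \sup_{k \in C(d')} \big( E_s^{x_1,x_2}(k) - E_t^{y_1,y_2}(k) \big) .$$

The result then follows from the monotonicity of the operators $\sup_{x_1 \in \mathcal{D}_1(s)}$, $\inf_{y_1 \in \mathcal{D}_1(t)}$,
$\sup_{y_2 \in \mathcal{D}_2(t)}$, $\inf_{x_2 \in \mathcal{D}_2(s)}$. ■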

On the basis of this lemma, we can define the least fixpoints of $H_{\preceq_1}$ and $H_{\sqsubseteq_1}$, which
will yield our game simulation and bisimulation metrics.

Definition 3.3. A priori metrics:

• The a priori simulation metric $[\preceq_1]$ is the least fixpoint of $H_{\preceq_1}$.

• The a priori bisimulation metric $[\simeq_1]$ is the least symmetrical fixpoint of $H_{\preceq_1}$.

A posteriori metrics:

• The a posteriori game simulation metric $[\sqsubseteq_1]$ is the least fixpoint of $H_{\sqsubseteq_1}$.

• The a posteriori game bisimulation metric $[\cong_1]$ is the least symmetrical fixpoint of $H_{\sqsubseteq_1}$.

By exchanging the roles of the players, we define the metric transformers $H_{\preceq_2}$ and $H_{\sqsubseteq_2}$,
and the metrics $[\preceq_2]$, $[\simeq_2]$, $[\sqsubseteq_2]$, $[\cong_2]$.

We note that the a posteriori simulation metric $[\sqsubseteq_1]$ has been introduced in previous work. We
also note that the a posteriori bisimulation metric $[\cong_i]$ can be defined as the least fixpoint
of $H_{\cong_i} : \mathcal{M} \to \mathcal{M}$, defined for all $d \in \mathcal{M}$ and $i \in \{1,2\}$ by

$$H_{\cong_i}(d) = H_{\sqsubseteq_i}(d) \sqcup \mathrm{Opp}(H_{\sqsubseteq_i}(d)) , \tag{3.10}$$

where $\mathrm{Opp}(d) = \breve{d}$ denotes the opposite of a metric $d$. Similarly, the a priori bisimulation
metric $[\simeq_i]$ can be defined as the least fixpoint of $H_{\simeq_i} : \mathcal{M} \to \mathcal{M}$, defined for all $d \in \mathcal{M}$
and $i \in \{1,2\}$ by

$$H_{\simeq_i}(d) = H_{\preceq_i}(d) \sqcup \mathrm{Opp}(H_{\preceq_i}(d)) . \tag{3.11}$$

We wish to show that the metrics of Definition 3.3 can be computed via Picard iteration.
To this end, it is necessary to show that the operators $H_{\preceq_1}$ and $H_{\sqsubseteq_1}$ on the lattice $(\mathcal{M}, \le)$
are upper semi-continuous. In fact, a very similar proof shows that the operators are lower
semi-continuous, and thus, continuous; we omit the proof of this more general fact as it is
not required for the desired result about the applicability of Picard iteration.
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Lemma 3.4. The operators and Hq ± on the lattice (M, <) are upper semi- continuous. 

Proof. Let D C M. be an arbitrary set of distances, and let d* = supD; note that d* exists, 
as (M, <) is a complete lattice. 

We first prove the result for . We need to prove that (sup D) = sup d£D (d), 
which we abbreviate (sup D) = sup H-^ x (D). In one direction, (sup D) > sup (D) 
follows from the monotonicity of H^, 1 (Lemma 13 . 2[) . In the other direction, we will show 
that for all e > 0, there is d E D such that \H~i 1 (d*) — H^ 1 (d)\ < e, where for d, d' E Ai, 
\d—d'\ is the 1-norm distance between d and d' . For convenience, let G{k) E M be defined as 
G(k)(s,t) = Prei(fc)(s) — Prei(/c)(i), so that we can write (d) = [s = t] Usup fceC -( d ) G(k). 

Given e > 0, choose d E D such that for all s,t E S, we have d(s,t)/d* (s,t) > 1 — e/4 
if d*(s,t) > 0, and d(s,t) = if d*(s,t) = 0. Note that for all k E C(d*), we have 
(1 - e/4)/c E C(d) and \k - (1 - e/4)fe| < e/4, as |&| < 1. Thus, d E D is such that for 
all /c E C(tf*), there is k! E C(<i) with \k — k'\ < e/4. In other words, d is such that the 
Hausdorff distance between C(d*) and C(d) is at most e/4. We now prove that for this d, 
we have 

| sup G(k)- sup G(Jfe)|<e. (3.12) 

kec(d*) kec(d) 

In fact, let k* E C(d*) be such that 

\G(k*) - sup G(k)\ < e/2 . (3.13) 

kec(d*) 

and let k! E C(<i) be such that \k* — k'\ < e/4. For s,t E S, we have by definition 
G(k*)(s,t) =Prei(A;*)(s)-Prei(A;*)(i); let 

xi(s)=arg sup inf Ef ,2/ (F) . 

By employing x\(s) at all s E S 1 , player 1 can guarantee 

\G(k')(s,t)-G(k*)(s,t)\<e/2, 

which together with (|3. 13|) leads to (|3.12|) . In turn, (|3.12p yields the result. 

We can prove the result for $H_{\sqsubseteq_1}$ following a similar argument. Precisely, in one direction,
$H_{\sqsubseteq_1}(\sup D) \ge \sup H_{\sqsubseteq_1}(D)$ follows from the monotonicity of $H_{\sqsubseteq_1}$ (Lemma 3.2). In the other
direction, we will show that for all $\epsilon > 0$, there is $d \in D$ such that $|H_{\sqsubseteq_1}(d^*) - H_{\sqsubseteq_1}(d)| < \epsilon$,
where for $d, d' \in \mathcal{M}$, $|d - d'|$ is the 1-norm distance between $d$ and $d'$. Again, let $d$ be such
that the Hausdorff distance between $C(d^*)$ and $C(d)$ is at most $\epsilon/2$. For such a $d$, we have
that for all $s,t \in S$, and $x_1 \in \mathcal{D}_1(s)$, $y_1 \in \mathcal{D}_1(t)$, $x_2 \in \mathcal{D}_2(s)$, $y_2 \in \mathcal{D}_2(t)$,

$$\Bigl| \sup_{k \in C(d^*)} \bigl(E_s^{x_1,x_2}(k) - E_t^{y_1,y_2}(k)\bigr) - \sup_{k \in C(d)} \bigl(E_s^{x_1,x_2}(k) - E_t^{y_1,y_2}(k)\bigr) \Bigr| \le \epsilon ,$$

and this leads easily to the result. ■

This result implies that we can compute $[\preceq_1]$ as the fixpoint of $H_{\preceq_1}$ via Picard iteration;
we denote by $d_n = H^n_{\preceq_1}(\mathbf{0})$ the n-iterate of this. Similarly, we can compute $[\sqsubseteq_1]$ as the
fixpoint of $H_{\sqsubseteq_1}$ via Picard iteration.
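
As an illustration of Picard iteration, the following sketch iterates a monotone transformer from the zero distance until it stabilizes. It is not the game transformer $H_{\preceq_1}$ itself: as a simplifying assumption it uses a deterministic transition system with a single successor per state, for which the transformer is taken to be $d'(s,t) = [s \equiv t] \sqcup d(\mathit{succ}(s), \mathit{succ}(t))$. The names `picard_lfp`, `make_op`, `succ`, `prop`, and the tolerance are hypothetical and chosen for this example only.

```python
from itertools import product


def picard_lfp(op, states, tol=1e-12, max_iter=10_000):
    """Iterate d_{n+1} = op(d_n) from the zero distance until the change is at most tol."""
    d = {(s, t): 0.0 for s, t in product(states, repeat=2)}
    for _ in range(max_iter):
        nxt = op(d)
        if max(abs(nxt[p] - d[p]) for p in d) <= tol:
            return nxt
        d = nxt
    raise RuntimeError("no convergence within max_iter")


def make_op(succ, prop):
    """Assumed toy transformer: d'(s,t) = max(prop(s,t), d(succ(s), succ(t)))."""
    def op(d):
        return {(s, t): max(prop[(s, t)], d[(succ[s], succ[t])]) for (s, t) in d}
    return op


# Hypothetical four-state system: s -> u, t -> w, and u, w are sinks.
states = ["s", "t", "u", "w"]
succ = {"s": "u", "t": "w", "u": "u", "w": "w"}
prop = {(a, b): 0.0 for a, b in product(states, repeat=2)}
prop[("u", "w")] = prop[("w", "u")] = 1.0  # u and w are propositionally distinct

d = picard_lfp(make_op(succ, prop), states)
```

In this toy instance the iteration stabilizes after a few rounds: the propositional distance between the sinks propagates backwards to the pair (s, t), while pairs that share a successor stay at distance 0.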

Theorem 3.5. The following assertions hold, for $i \in \{1,2\}$:
(1) Let $d_0 = d'_0 = \mathbf{0}$, and for $n \ge 0$, let

$$d_{n+1} = H_{\preceq_i}(d_n) \quad\text{and}\quad d'_{n+1} = H_{\sqsubseteq_i}(d'_n) . \tag{3.14}$$

We have $\lim_{n \to \infty} d_n = [\preceq_i]$ and $\lim_{n \to \infty} d'_n = [\sqsubseteq_i]$.
(2) Let $b_0 = b'_0 = \mathbf{0}$, and for $n \ge 0$, let

$$b_{n+1} = H_{\preceq_i}(b_n) \sqcup \mathrm{Opp}(H_{\preceq_i}(b_n)) \quad\text{and}\quad b'_{n+1} = H_{\sqsubseteq_i}(b'_n) \sqcup \mathrm{Opp}(H_{\sqsubseteq_i}(b'_n)) . \tag{3.15}$$

We have $\lim_{n \to \infty} b_n = [\simeq_i]$ and $\lim_{n \to \infty} b'_n = [\cong_i]$.

Proof. The statements follow from the definitions of the metrics, and from Lemmas 3.2
and 3.4. ■

We now show some basic properties of these metrics. First, we show that the a priori 
fixpoints give a (directed) metric, i.e., they are non-negative and satisfy the triangle inequal- 
ity. We also prove that the a priori and a posteriori metrics are distinct. We then focus on 
the a priori metrics, and show, through our results, that they are the natural metrics for 
concurrent games. 

Theorem 3.6. For all game structures G, and all states s, t, u of G, we have,

(1) $[s \preceq_1 t] \ge 0$ and $[s \preceq_1 u] \le [s \preceq_1 t] + [t \preceq_1 u]$.

(2) $[s \sqsubseteq_1 t] \ge 0$ and $[s \sqsubseteq_1 u] \le [s \sqsubseteq_1 t] + [t \sqsubseteq_1 u]$.

Proof. We prove the following statement: if $d \in \mathcal{M}$ is a directed metric, then:

(1) $H_{\preceq_1}(d)$ is a directed metric;

(2) $H_{\sqsubseteq_1}(d)$ is a directed metric.

The theorem then follows by induction on the Picard iteration with which the a priori and
a posteriori metrics can be computed (Theorem 3.5). We prove the result first for the a
priori metric.

First, from $d' = H_{\preceq_1}(d)$ and $[\equiv] \ge 0$, we immediately have $d' \ge 0$ (where inequalities
are interpreted in pointwise fashion).

To prove the triangle inequality, we observe that $[s \equiv t] + [t \equiv u] \ge [s \equiv u]$ for all
$s,t,u \in S$. Also,

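$$
\begin{aligned}
& \sup_{k \in C(d)} \bigl(\mathrm{Pre}_1(k)(s) - \mathrm{Pre}_1(k)(t)\bigr) + \sup_{k \in C(d)} \bigl(\mathrm{Pre}_1(k)(t) - \mathrm{Pre}_1(k)(u)\bigr) \\
&\quad \ge \sup_{k \in C(d)} \bigl(\mathrm{Pre}_1(k)(s) - \mathrm{Pre}_1(k)(t) + \mathrm{Pre}_1(k)(t) - \mathrm{Pre}_1(k)(u)\bigr) \\
&\quad = \sup_{k \in C(d)} \bigl(\mathrm{Pre}_1(k)(s) - \mathrm{Pre}_1(k)(u)\bigr) .
\end{aligned}
$$

Thus, we obtain

$$
\begin{aligned}
& H_{\preceq_1}(d)(s,t) + H_{\preceq_1}(d)(t,u) \\
&\quad = \Bigl([s \equiv t] \sqcup \sup_{k \in C(d)} \bigl(\mathrm{Pre}_1(k)(s) - \mathrm{Pre}_1(k)(t)\bigr)\Bigr) + \Bigl([t \equiv u] \sqcup \sup_{k \in C(d)} \bigl(\mathrm{Pre}_1(k)(t) - \mathrm{Pre}_1(k)(u)\bigr)\Bigr) \\
&\quad \ge \Bigl([s \equiv u] \sqcup \sup_{k \in C(d)} \bigl(\mathrm{Pre}_1(k)(s) - \mathrm{Pre}_1(k)(u)\bigr)\Bigr) = H_{\preceq_1}(d)(s,u) ,
\end{aligned}
$$

leading to the result.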

For the a posteriori metric, let $d' = H_{\sqsubseteq_1}(d)$; again, we can prove $d' \ge 0$ as in the a priori
case. To prove the triangle inequality for $d'$, for $s,t \in S$, and for distributions $x_1 \in \mathcal{D}_1(s)$
and $y_1 \in \mathcal{D}_1(t)$, it is convenient to let

$$G(x_1, y_1)(s,t) = \sup_{y_2 \in \mathcal{D}_2(t)} \inf_{x_2 \in \mathcal{D}_2(s)} \sup_{k \in C(d)} \bigl(E_s^{x_1,x_2}(k) - E_t^{y_1,y_2}(k)\bigr) .$$

With this notation, for $s,t,u \in S$, we have

$$H_{\sqsubseteq_1}(d)(s,u) = [s \equiv u] \sqcup \sup_{x_1 \in \mathcal{D}_1(s)} \inf_{z_1 \in \mathcal{D}_1(u)} G(x_1, z_1)(s,u) . \tag{3.16}$$

Intuitively, the quantity $G(x_1, z_1)(s,u)$ is the distance between s and u computed in the
2-MDP obtained when player 1 plays $x_1$ at s and $z_1$ at u. As a consequence of Theorem 3.1
(interpreted over 2-MDPs), and of the previous proof for the a priori case, we have that

$$G(x_1, z_1)(s,u) \le G(x_1, y_1)(s,t) + G(y_1, z_1)(t,u) \tag{3.17}$$

for all $x_1 \in \mathcal{D}_1(s)$, $y_1 \in \mathcal{D}_1(t)$, and $z_1 \in \mathcal{D}_1(u)$. This observation will be useful in the
following.

For any $\epsilon > 0$, let $x_1^*$ realize the sup in (3.16) within $\epsilon$, that is,

$$\inf_{z_1 \in \mathcal{D}_1(u)} G(x_1^*, z_1)(s,u) \ge \sup_{x_1 \in \mathcal{D}_1(s)} \inf_{z_1 \in \mathcal{D}_1(u)} G(x_1, z_1)(s,u) - \epsilon , \tag{3.18}$$

and let $z_1^*$ realize the inf of the left-hand side of (3.18) also within $\epsilon$. Intuitively, $x_1^*$ is the
player-1 distribution at s that is hardest to imitate from u, and $z_1^*$ is the best imitation of $x_1^*$
available at u. In the same fashion, let $y_1^*$ realize the inf within $\epsilon$ in $\inf_{y_1 \in \mathcal{D}_1(t)} G(x_1^*, y_1)(s,t)$,
and let $z'_1$ realize the inf within $\epsilon$ in $\inf_{z_1 \in \mathcal{D}_1(u)} G(y_1^*, z_1)(t,u)$. In intuitive terms, $y_1^*$ is the
imitator of $x_1^*$ in t, and $z'_1$ is the imitator of $y_1^*$ in u.

We consider two cases. If $[s \equiv u] = 1$, then we are sure that the triangle inequality

$$d'(s,u) \le d'(s,t) + d'(t,u) \tag{3.19}$$

holds. Otherwise, note that

$$d'(s,u) \le G(x_1^*, z_1^*)(s,u) + 2\epsilon . \tag{3.20}$$

Since $x_1^*$ is not necessarily the distribution at s that is hardest to imitate from t, and since
$y_1^*$ is not necessarily the distribution at t that is hardest to imitate from u, we also have:

$$d'(s,t) \ge G(x_1^*, y_1^*)(s,t) - \epsilon , \qquad d'(t,u) \ge G(y_1^*, z'_1)(t,u) - \epsilon . \tag{3.21}$$

Since the triangle inequality holds for MDPs, as stated by (3.17), we have

$$G(x_1^*, z'_1)(s,u) \le G(x_1^*, y_1^*)(s,t) + G(y_1^*, z'_1)(t,u) \le d'(s,t) + d'(t,u) + 2\epsilon . \tag{3.22}$$

Since $z_1^*$ is the best imitator of $x_1^*$ at u, we also have

$$G(x_1^*, z_1^*)(s,u) - \epsilon \le G(x_1^*, z'_1)(s,u) , \tag{3.23}$$

which together with (3.22) yields

$$G(x_1^*, z_1^*)(s,u) \le d'(s,t) + d'(t,u) + 3\epsilon . \tag{3.24}$$

From the choice of $x_1^*$, this finally leads to

$$d'(s,u) \le d'(s,t) + d'(t,u) + 5\epsilon ,$$

for all $\epsilon > 0$, which yields the desired triangle inequality (3.19). ■
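
The two properties established by Theorem 3.6 can be checked mechanically on any finite distance table. The sketch below is only an illustration: the function name `is_directed_metric`, the tolerance, and the example tables are assumptions of this example, and the tables are not computed from any game in the paper. Note that symmetry is deliberately not required, since the metrics are directed.

```python
from itertools import product


def is_directed_metric(d, states, eps=1e-12):
    """Check non-negativity and the triangle inequality d(s,u) <= d(s,t) + d(t,u)."""
    nonneg = all(d[(s, t)] >= 0 for s, t in product(states, repeat=2))
    triangle = all(
        d[(s, u)] <= d[(s, t)] + d[(t, u)] + eps
        for s, t, u in product(states, repeat=3)
    )
    return nonneg and triangle


states = ["a", "b", "c"]

# Hypothetical asymmetric table that satisfies both properties.
good = {(s, t): 0.0 for s, t in product(states, repeat=2)}
good[("a", "b")] = 0.2
good[("b", "c")] = 0.3
good[("a", "c")] = 0.5

# Same table with d(a,c) raised above d(a,b) + d(b,c): the triangle inequality fails.
bad = dict(good)
bad[("a", "c")] = 0.9

good_ok = is_directed_metric(good, states)
bad_ok = is_directed_metric(bad, states)
```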
Figure 1: A game that shows that the a priori and the a posteriori metrics may not coincide.

The tables above show the transition probabilities from states t and s to states 
w and u for pure moves of the two players. The row player is player 1 and the 
column player is player 2. The line below is the two dimensional probability 
simplex that shows the transition probabilities induced by convex combinations 
of pure moves of the two players. 



3.3. A priori and a posteriori metrics are distinct. First, we show that a priori and
a posteriori metrics are distinct in general: the a priori metric never exceeds the a posteriori
one, and there are concurrent games where it is strictly smaller. Intuitively, this can be
explained as follows. Simulation entails trying to simulate the expectation of a valuation k,
as we see from (3.8), (3.9). It is easier to simulate a state s from a state t if the valuation
is known in advance, as in a priori metrics (3.9), than if the valuation k is chosen after all
the moves have been chosen, as in a posteriori metrics (3.8).

As a special case, we shall see that equality holds for turn-based game structures, in
addition to MDPs as we have seen in the previous subsection.

Theorem 3.7. The following assertions hold.

(1) For all game structures G, and for all states s,t of G, we have $[s \preceq_1 t] \le [s \sqsubseteq_1 t]$.

(2) There is a game structure G, and states s,t of G, such that $[s \preceq_1 t] = 0$ and $[s \sqsubseteq_1 t] > 0$.

(3) For all turn-based game structures, we have $[\preceq_1] = [\sqsubseteq_1]$.

Proof. The first assertion is a consequence of the fact that, for all functions $f : \mathbb{R}^2 \to \mathbb{R}$,
we have $\sup_x \inf_y f(x,y) \le \inf_y \sup_x f(x,y)$. By repeated applications of this, we can show
that, for all $d \in \mathcal{M}$, we have $H_{\preceq_1}(d) \le H_{\sqsubseteq_1}(d)$ (with pointwise ordering). The result then
follows from the monotonicity of $H_{\preceq_1}$ and $H_{\sqsubseteq_1}$.
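
The inequality $\sup_x \inf_y f \le \inf_y \sup_x f$ used above can be observed on finite tables, where sup and inf become max and min. The following is a small sketch under that finite-domain assumption; the helper names `maxmin` and `minmax` and the example table are chosen for this illustration only.

```python
import random


def maxmin(f):
    """max over rows x of min over columns y of f[x][y]."""
    return max(min(row) for row in f)


def minmax(f):
    """min over columns y of max over rows x of f[x][y]."""
    return min(max(col) for col in zip(*f))


# A table where the inequality is strict: choosing second is an advantage.
f = [[0.0, 1.0],
     [1.0, 0.0]]
strict_gap = (maxmin(f), minmax(f))

# The inequality itself holds for every table; check some random ones.
rng = random.Random(0)
always_holds = all(
    maxmin(g) <= minmax(g)
    for g in ([[rng.random() for _ in range(4)] for _ in range(3)] for _ in range(1000))
)
```

The strict gap in the 2-by-2 table mirrors the intuition of this subsection: the order in which choices are revealed can change the achievable value.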

For the second assertion, we give an example where a priori distances are strictly less
than a posteriori distances. Consider a game with states $S = \{s,t,u,w\}$. States u and w
are sink states with $[u \equiv w] = 1$; states s and t are such that $[s \equiv t] = 0$. At states s and
t, player 2 has moves $\{f, g\}$. Player 1 has a single move $\{a\}$ at state s, and moves $\{b, c\}$ at
state t. The moves from s and t lead to u and w with transition probabilities indicated in
Figure 1. In the figure, the point b, f indicates the probability of going to u and w when the
move pair (b, f) is played, with $\delta(t,b,f)(u) + \delta(t,b,f)(w) = 1$; similarly for the other move
pairs. The thick line segment between the points a, f and a, g represents the transition
probabilities arising when player 1 plays move a, and player 2 plays a mixed move (a mix
of f and g).

We show that, in this game, we have $[s \sqsubseteq_1 t] > 0$. Consider the metric d where
$d(u,w) = 1$ (recall that $[u \equiv w] = 1$, and note the other distances do not matter, since u,
w are the only two destinations). We need to show

$$\forall y_1 \in \mathcal{D}_1(t) . \exists y_2 \in \mathcal{D}_2(t) . \forall x_2 \in \mathcal{D}_2(s) . \exists k \in C(d) . \bigl(E_s^{a,x_2}(k) - E_t^{y_1,y_2}(k)\bigr) > 0 . \tag{3.25}$$

Consider any mixed move $y_1 = \alpha b + (1 - \alpha) c$, where b, c are the moves available to player 1
at t, and $0 \le \alpha \le 1$. If $\alpha > \frac{1}{2}$, choose move f from t as $y_2$, and choose $k(w) = 1$, $k(u) = 0$.
Otherwise, choose move g from t as $y_2$, and choose $k(w) = 0$, $k(u) = 1$. With these
choices, the transition probability $\delta(t,y_1,y_2)$ will fall outside of the segment $[(a,f), (a,g)]$
in Figure 1. Thus, with the choice of k above, we ensure that the difference in (3.25) is
always positive.

To show that in the game we have $[s \preceq_1 t] = 0$, it suffices to show (given that
$[s \preceq_1 t] \ge 0$) that

$$\forall k \in C(d) . \exists y_1 \in \mathcal{D}_1(t) . \forall y_2 \in \mathcal{D}_2(t) . \exists x_2 \in \mathcal{D}_2(s) . \bigl(E_s^{a,x_2}(k) - E_t^{y_1,y_2}(k)\bigr) \le 0 .$$

If $k(u) = k(w)$, the result is immediate. Assume otherwise, that $k(u) < k(w)$, and choose
$y_1 = c$. For every $y_2$, the distribution over successor states (and of k-expectations) will be in
the interval $[(c,f), (c,g)]$ in Figure 1. By choosing $x_2 = f$, we have that $E_s^{a,f}(k) < E_t^{c,y_2}(k)$
for all $y_2 \in \mathcal{D}_2(t)$, leading to the result. Similarly if $k(u) > k(w)$, by choosing $y_1 = b$,
the distribution over successor states (and of k-expectations) will now be in the interval
$[(b,f), (b,g)]$. By choosing $x_2 = g$, we have that $E_s^{a,g}(k) < E_t^{b,y_2}(k)$ for all $y_2 \in \mathcal{D}_2(t)$, again
leading to the result.

The last assertion of the theorem is proved in the same way as Theorem 3.1. ■

3.4. Reciprocity of a priori metric. The previous theorem establishes that the a priori
and a posteriori metrics are in general distinct. We now prove that it is the a priori metric,
rather than the a posteriori one, that enjoys reciprocity, and that provides a (quantitative)
logical characterization of $q\mu$. We begin by considering reciprocity.

Theorem 3.8. The following assertions hold.

(1) For all game structures G, we have $[\preceq_1] = [\succeq_2]$, and $[\simeq_1] = [\simeq_2]$.

(2) There is a concurrent game structure G, with states s and t, where $[s \sqsubseteq_1 t] \ne [t \sqsubseteq_2 s]$.

(3) There is a concurrent game structure G, with states s and t, where $[s \cong_1 t] \ne [s \cong_2 t]$.

Proof. For the first assertion, it suffices to show that, for all $d \in \mathcal{M}$, and states $s,t \in S$, we
have $H_{\preceq_1}(d)(s,t) = H_{\preceq_2}(d)(t,s)$. We proceed as follows:




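$$
\begin{aligned}
H_{\preceq_1}(d)(s,t) &= [s \equiv t] \sqcup \sup_{k \in C(d)} \bigl(\mathrm{Pre}_1(k)(s) - \mathrm{Pre}_1(k)(t)\bigr) && (3.26) \\
&= [s \equiv t] \sqcup \sup_{k \in C(d)} \bigl(\mathrm{Pre}_2(1-k)(t) - \mathrm{Pre}_2(1-k)(s)\bigr) && (3.27) \\
&= [s \equiv t] \sqcup \sup_{k \in C(d)} \bigl(\mathrm{Pre}_2(k)(t) - \mathrm{Pre}_2(k)(s)\bigr) . && (3.28)
\end{aligned}
$$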
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The step from ggp to (|3T27j) uses Prei(A;)(s) = 1 - Pre 2 (l - k)(s) pElE], and the step 
from (|3.27p to (|3.28p uses the change of variables k — »■ 1 — k. 
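
The identity $\mathrm{Pre}_1(k)(s) = 1 - \mathrm{Pre}_2(1-k)(s)$ can be checked numerically on a single state with two moves per player. The sketch below rests on several assumptions of this example: the payoff table is hypothetical, the helper `value_2x2` uses the standard closed form for the value of a 2-by-2 zero-sum matrix game (a pure saddle point if one exists, otherwise the mixed-strategy value), and exact rationals are used to avoid rounding.

```python
from fractions import Fraction as Fr


def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game in which the row player maximizes."""
    (a, b), (c, d) = m
    lower = max(min(a, b), min(c, d))  # best pure guarantee of the row player
    upper = min(max(a, c), max(b, d))  # best pure guarantee of the column player
    if lower == upper:                 # pure saddle point
        return lower
    return (a * d - b * c) / (a + d - b - c)  # mixed-strategy value


def transpose(m):
    return [list(col) for col in zip(*m)]


# Hypothetical table: entry [i][j] is the expectation of k when player 1
# plays move i and player 2 plays move j at some fixed state.
k_table = [[Fr(1), Fr(0)],
           [Fr(1, 3), Fr(2, 3)]]

# Pre_1(k): player 1 (rows) maximizes the expectation of k.
pre1_k = value_2x2(k_table)

# Pre_2(1 - k): player 2 maximizes the expectation of 1 - k, so player 2
# becomes the row player of the transposed, complemented table.
compl = [[1 - x for x in row] for row in k_table]
pre2_compl = value_2x2(transpose(compl))
```

For this table both values equal 1/2, so the two sides of the identity agree.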

For the second assertion, consider again the game of Figure 1. We will show that
$[t \sqsubseteq_2 s] = 0$. Together with $[s \sqsubseteq_1 t] > 0$, as shown in the proof of Theorem 3.7, this leads
to the result. To obtain the result, we will prove that for all d, we have:

$$\forall y_2 \in \mathcal{D}_2(t) . \exists x_2 \in \mathcal{D}_2(s) . \exists y_1 \in \mathcal{D}_1(t) . \forall k \in C(d) . \bigl(E_t^{y_2,y_1}(k) - E_s^{x_2,a}(k)\bigr) = 0 ,$$

where we have used the fact that player 1 at s plays $x_1 = a$. Any mixed move $y_2 \in \mathcal{D}_2(t)$
can be written as $y_2 = \alpha f + (1 - \alpha) g$ for $0 \le \alpha \le 1$. Choose $y_1 = \alpha c + (1 - \alpha) b$, and



*2 



«i 1/ + \a 



,3" 3 

Under this choice of mixed moves, we have: 

4 



(1 



a 



5(t,yi,y 2 )(w) = -a + a(l - a) + -(1 - a) 

y 

2X +(l-a) 



<5(s,xi,x 2 )(w) 



1 1 

ai 1 

\3 3 3 3 



1 2 

3 /+ 3 5 



5 

~ 9 

2 2 1 

3 ' 3 + 3 



-a 



5 
9 



-a 



As the probabilities of transitions to w are equal from t and s, we obtain that for all
$k \in C(d)$, we have $E_t^{y_2,y_1}(k) - E_s^{x_2,a}(k) = 0$, as desired.

For the third assertion, we consider a modified version of the game depicted in Figure 1,
obtained by adding two new moves to player 2 at state t, namely $f'$ and $g'$. We define the
transition probabilities of these new moves by

$$\delta(t,*,f') = \delta(s,a,f) \qquad \delta(t,*,g') = \delta(s,a,g) .$$

To prove $[s \sqsubseteq_1 t] > 0$, we can proceed as in the proof of Theorem 3.7, noting that we can
choose $y_2$ as in that proof (this is possible, as player 2 at t has more moves available in the
modified game). This leads to $[s \cong_1 t] > 0$.

To show that $[s \cong_2 t] = 0$, given the transition structure of the game, it suffices to show
that $[s \sqsubseteq_2 t] = 0$ and $[t \sqsubseteq_2 s] = 0$. To show that $[s \sqsubseteq_2 t] = 0$, we show that for all d, we
have:

$$\forall x_2 \in \mathcal{D}_2(s) . \exists y_2 \in \mathcal{D}_2(t) . \forall y_1 \in \mathcal{D}_1(t) . \forall k \in C(d) . \bigl(E_s^{x_2,a}(k) - E_t^{y_2,y_1}(k)\bigr) = 0 .$$

We can write any mixed move $x_2 \in \mathcal{D}_2(s)$ as $x_2 = \alpha f + (1 - \alpha) g$. We can then choose
$y_2 = \alpha f' + (1 - \alpha) g'$, and since at t under $f'$, $g'$ the transition probabilities do not depend
on the mixed move $y_1$ chosen by player 1, we have that the transition probabilities from s
and t match for all $0 \le \alpha \le 1$.

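To show that $[t \sqsubseteq_2 s] = 0$, we need to show that:

$$\forall y_2 \in \mathcal{D}_2(t) . \exists x_2 \in \mathcal{D}_2(s) . \exists y_1 \in \mathcal{D}_1(t) . \forall k \in C(d) . \bigl(E_t^{y_2,y_1}(k) - E_s^{x_2,a}(k)\bigr) = 0 .$$

Any mixed move $y_2 \in \mathcal{D}_2(t)$ can be written as

$$y_2 = \gamma \bigl(\alpha f + (1 - \alpha) g\bigr) + (1 - \gamma) \bigl(\beta f' + (1 - \beta) g'\bigr)$$

for some $\alpha, \beta, \gamma \in [0,1]$. We choose $x_2$ and $y_1$ as follows: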



£ 2 = 07 



\ f + \ 9 . 



+ (1 - a)7 



+ (1 - 7) /?/ + (1 - 



2/i = ac + (1 — a)6 . 
With these mixed moves, we have $\delta(s,a,x_2) = \delta(t,y_1,y_2)$, leading to the result.

As a consequence of this theorem, we write $[\simeq_g]$ in place of $[\simeq_1] = [\simeq_2]$, to emphasize that
the player 1 and player 2 versions of game equivalence metrics coincide.

3.5. Logical characterization of a priori metric. We now prove that $q\mu$ provides a
logical characterization for the a priori metrics. We first state and prove two lemmas that
lead to the desired result. The proofs of the lemmas use ideas from [TO] and [TO]. We recall
from Theorem 3.5 that we can compute $[\preceq_1]$ via Picard iteration, with $d_n = H^n_{\preceq_1}(\mathbf{0})$ being
the n-iterate.

We prove the existence of a logical characterization via the following two
lemmas. The first lemma proves that a priori metrics provide a bound for the difference in
value of $q\mu$-formulas.

Lemma 3.9. The following assertions hold for all game structures.

(1) For all $\varphi \in q\mu_1^+$, and for all $s,t \in S$, we have $[\![\varphi]\!](s) - [\![\varphi]\!](t) \le [s \preceq_1 t]$.

(2) For all $\varphi \in q\mu$, and for all $s,t \in S$, we have $|[\![\varphi]\!](s) - [\![\varphi]\!](t)| \le [s \simeq_g t]$.

Proof. We prove the first assertion. The proof is by induction on the structure of a (possibly
open) formula $\varphi \in q\mu_1^+$. Call a variable valuation $\xi$ bounded if, for all variables $Z \in \mathit{MVars}$
and states s,t, we have that $\xi(Z)(s) - \xi(Z)(t) \le [s \preceq_1 t]$. We prove by induction that for
all $s,t \in S$, for all bounded variable valuations $\xi$, we have $[\![\varphi]\!]_\xi(s) - [\![\varphi]\!]_\xi(t) \le [s \preceq_1 t]$. For
clarity, we sometimes omit writing the variable valuation $\xi$.

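The base case for constants is trivial, and the case for observation variables follows
since $[s \equiv t] \le [s \preceq_1 t]$. The case for variables $Z \in \mathit{MVars}$ follows from the assumption of
bounded variable valuations. For $\varphi_1 \vee \varphi_2$, assume the induction hypothesis for $\varphi_1$, $\varphi_2$, and
note that

$$
\begin{aligned}
& \bigl([\![\varphi_1]\!](s) \sqcup [\![\varphi_2]\!](s)\bigr) - \bigl([\![\varphi_1]\!](t) \sqcup [\![\varphi_2]\!](t)\bigr) \\
&\quad \le \bigl([\![\varphi_1]\!](s) - [\![\varphi_1]\!](t)\bigr) \sqcup \bigl([\![\varphi_2]\!](s) - [\![\varphi_2]\!](t)\bigr) \le [s \preceq_1 t] .
\end{aligned}
$$

The proof for $\wedge$ is similar. For $\varphi_1 \oplus c$ and $\varphi_1 \ominus c$, we have by induction hypothesis that
$[\![\varphi_1]\!](s) - [\![\varphi_1]\!](t) \le [s \preceq_1 t]$, and so the "shifted versions" also satisfy the same bound.

For the induction step for $\mathrm{pre}_1$, assume the induction hypothesis for $\varphi$, and note that
we can choose $k \in C([\preceq_1])$ such that $k(s) = [\![\varphi]\!](s)$ at all $s \in S$. We have, for all $s,t \in S$,

$$[\![\mathrm{pre}_1(\varphi)]\!](s) - [\![\mathrm{pre}_1(\varphi)]\!](t) \le \sup_{k \in C([\preceq_1])} \bigl(\mathrm{Pre}_1(k)(s) - \mathrm{Pre}_1(k)(t)\bigr) \le [s \preceq_1 t] , \tag{3.29}$$

where the last inequality follows by noting that $[\preceq_1]$ is a fixpoint of $H_{\preceq_1}$.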

The proof for the fixpoint operators is performed by considering their Picard iterates.
We consider the case $\mu Z.\varphi$; the proof for $\nu Z.\varphi$ is similar. Let $\xi$ be a bounded variable
valuation. Then, the variable valuation $\xi_0 = \xi[Z \mapsto \mathbf{0}]$ is also bounded, and by induction
hypothesis, the formula $\varphi$ when evaluated in the variable valuation $\xi_0$ satisfies

$$[\![\varphi]\!]_{\xi_0}(s) - [\![\varphi]\!]_{\xi_0}(t) \le [s \preceq_1 t] . \tag{3.30}$$

Now consider the variable valuation $\xi_1 = \xi[Z \mapsto [\![\varphi]\!]_{\xi_0}]$. From Equation (3.30), we get that $\xi_1$
is bounded, and again, by induction hypothesis, we have that $[\![\varphi]\!]_{\xi_1}(s) - [\![\varphi]\!]_{\xi_1}(t) \le [s \preceq_1 t]$.
In general, for $k \ge 0$, consider the variable valuation $\xi_{k+1} = \xi[Z \mapsto [\![\varphi]\!]_{\xi_k}]$. By the above
argument, each variable valuation $\xi_k$ is bounded, and so for every $k \ge 0$, we have

$$[\![\varphi]\!]_{\xi_k}(s) - [\![\varphi]\!]_{\xi_k}(t) \le [s \preceq_1 t] . \tag{3.31}$$

Taking the limit, as $k \to \infty$, we have that

$$\lim_{k \to \infty} \bigl([\![\varphi]\!]_{\xi_k}(s) - [\![\varphi]\!]_{\xi_k}(t)\bigr) = [\![\mu Z.\varphi]\!]_\xi(s) - [\![\mu Z.\varphi]\!]_\xi(t) \le [s \preceq_1 t] . \tag{3.32}$$

The proof of the second assertion can be done along the same lines, using the symmetry
of $\simeq_g$. The proof is again by induction on the structure of the formula. In particular, (3.29)
can be proved for either player: for $i \in \{1,2\}$,

$$[\![\mathrm{pre}_i(\varphi)]\!](s) - [\![\mathrm{pre}_i(\varphi)]\!](t) \le \sup_{k \in C([\simeq_g])} \bigl(\mathrm{Pre}_i(k)(s) - \mathrm{Pre}_i(k)(t)\bigr) \le [s \simeq_g t] .$$

Negation can be dealt with by noting that $[\![\neg\varphi]\!](s) - [\![\neg\varphi]\!](t) = [\![\varphi]\!](t) - [\![\varphi]\!](s)$ and by using
the symmetry of $\simeq_g$; the other cases are similar. ■

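The second lemma states that the $q\mu$ formulas can attain the distance computed by
the simulation metric.

Lemma 3.10. The following assertions hold for all game structures G, and for all states
s,t of G.

$$[s \preceq_1 t] \le \sup_{\varphi \in q\mu_1^+} \bigl([\![\varphi]\!](s) - [\![\varphi]\!](t)\bigr)$$
$$[s \simeq_g t] \le \sup_{\varphi \in q\mu} \bigl|[\![\varphi]\!](s) - [\![\varphi]\!](t)\bigr|$$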

Proof. We show by induction on n that $d_n(s,t) \le \sup_{\varphi \in q\mu_1^+} ([\![\varphi]\!](s) - [\![\varphi]\!](t))$. The base case
is trivial. For the induction step, the distance is:

$$d_{n+1}(s,t) = \sup_{k \in C(d_n)} \bigl(\mathrm{Pre}_1(k)(s) - \mathrm{Pre}_1(k)(t)\bigr) . \tag{3.33}$$

The challenge is to show that, for all $s,t \in S$, we can construct a formula $\psi_{st}$ that witnesses
the distance within an arbitrary $\epsilon > 0$:

$$d_{n+1}(s,t) - \epsilon \le [\![\psi_{st}]\!](s) - [\![\psi_{st}]\!](t) . \tag{3.34}$$

To this end, let $k^*$ be the value of k that realizes the sup in (3.33) within $\epsilon/4$. By induction
hypothesis, for each pair of states $s'$ and $t'$ we can choose $\varphi'_{s't'}$ such that

$$d_n(s',t') - \epsilon/4 \le [\![\varphi'_{s't'}]\!](s') - [\![\varphi'_{s't'}]\!](t') . \tag{3.35}$$

Let $\varphi_{s't'}$ be a shifted version of $\varphi'_{s't'}$, such that $[\![\varphi_{s't'}]\!](s') = k^*(s')$:

$$\varphi_{s't'} = \varphi'_{s't'} \oplus \bigl(k^*(s') - [\![\varphi'_{s't'}]\!](s')\bigr) . \tag{3.36}$$

We now prove that:

$$[\![\varphi_{s't'}]\!](s') = k^*(s') \tag{3.37}$$
$$[\![\varphi_{s't'}]\!](t') \le k^*(t') + \epsilon/4 . \tag{3.38}$$

Equality (3.37) is immediate from (3.36). We prove (3.38) as follows. We can rewrite (3.35)
as

$$[\![\varphi'_{s't'}]\!](t') - \epsilon/4 \le [\![\varphi'_{s't'}]\!](s') - d_n(s',t') . \tag{3.39}$$

Since $k^* \in C(d_n)$, we have $k^*(s') - k^*(t') \le d_n(s',t')$, or

$$k^*(t') - k^*(s') \ge -d_n(s',t') . \tag{3.40}$$

Plugging this relation into (3.39), we obtain

$$[\![\varphi'_{s't'}]\!](t') - \epsilon/4 \le [\![\varphi'_{s't'}]\!](s') + k^*(t') - k^*(s') . \tag{3.41}$$

Plugging this relation into (3.36) evaluated at $t'$, we obtain

$$[\![\varphi_{s't'}]\!](t') - \epsilon/4 \le \bigl([\![\varphi'_{s't'}]\!](s') + k^*(t') - k^*(s')\bigr) \oplus \bigl(k^*(s') - [\![\varphi'_{s't'}]\!](s')\bigr) ,$$

or

$$[\![\varphi_{s't'}]\!](t') - \epsilon/4 \le k^*(t') - \bigl(k^*(s') - [\![\varphi'_{s't'}]\!](s')\bigr) \oplus \bigl(k^*(s') - [\![\varphi'_{s't'}]\!](s')\bigr) \le k^*(t') ,$$

which proves (3.38). Define now $\varphi_{s'} = \bigwedge_{t'} \varphi_{s't'}$. From (3.37) and (3.38) we have

$$[\![\varphi_{s'}]\!](s') = k^*(s') \tag{3.42}$$
$$[\![\varphi_{s'}]\!](t') \le k^*(t') + \epsilon/4 . \tag{3.43}$$

Define then $\varphi = \bigvee_{s'} \varphi_{s'}$. From (3.42), (3.43), we have that

$$k^*(s') \le [\![\varphi]\!](s') \le k^*(s') + \epsilon/4 \tag{3.44}$$

for all $s' \in S$. As formula $\psi_{st}$, we propose thus to take the formula $\mathrm{pre}_1(\varphi)$. From (3.44), we
have that $|[\![\psi_{st}]\!](s) - \mathrm{Pre}_1(k^*)(s)| \le \epsilon/4$, and similarly, $|[\![\psi_{st}]\!](t) - \mathrm{Pre}_1(k^*)(t)| \le \epsilon/4$. By
comparison with (3.33), and by the fact that $k^*$ realizes the sup within $\epsilon/4$, we finally have
(3.34), as desired. ■

From these two lemmas, we can conclude that $q\mu$ provides a logical characterization
for the a priori metrics, as stated by the next theorem.

Theorem 3.11. The following assertions hold for all game structures G, and for all states
s,t of G:

$$[s \preceq_1 t] = \sup_{\varphi \in q\mu_1^+} \bigl([\![\varphi]\!](s) - [\![\varphi]\!](t)\bigr) \qquad [s \simeq_g t] = \sup_{\varphi \in q\mu} \bigl|[\![\varphi]\!](s) - [\![\varphi]\!](t)\bigr|$$

We note that, due to Theorem 3.7, an analogous result does not hold for the a posteriori
metrics. Together with the lack of reciprocity of the a posteriori metrics, this is a strong
indication that the a priori metrics, and not the a posteriori ones, are the "natural" metrics
on concurrent games.

Our metrics are not characterized by the probabilistic temporal logic PCTL [131 13]. In
fact, the values of PCTL formulas can change from true to false when certain probabilities
cross given thresholds, so that PCTL formulas can have different boolean values on games
that are very close in transition probabilities, and hence, very close in our metric.
Quantitative metrics such as the ones developed in this paper are suited to quantitative-valued
formulas, such as those of $q\mu$.


3.6. The Kernel. The kernel of the metric $[\simeq_g]$ defines an equivalence relation $\simeq_g$ on the
states of a game structure: $s \simeq_g t$ iff $[s \simeq_g t] = 0$. We call this the game bisimulation
relation. Notice that by the reciprocity property of $\simeq_g$, the game bisimulation relation is
canonical: $\simeq_1 = \simeq_2 = \simeq_g$. Similarly, we define the game simulation preorder $s \preceq_1 t$ as the
kernel of the directed metric $[\preceq_1]$, that is, $s \preceq_1 t$ iff $[s \preceq_1 t] = 0$. Alternatively, it is possible
to define $\preceq_1$ and $\simeq_g$ directly. Given a relation $R \subseteq S \times S$, let $B(R) \subseteq \mathcal{F}$ consist of all
valuations $k \in \mathcal{F}$ such that, for all $s,t \in S$, if $sRt$ then $k(s) \le k(t)$. We have the following
result.

Theorem 3.12. Given a game structure G, the relation $\preceq_1$ (resp. $\simeq_1$) can be characterized
as the largest (resp. largest symmetrical) relation R such that, for all states s,t with $sRt$,
we have $s \equiv t$ and

$$\forall k \in B(R) . \forall x_1 \in \mathcal{D}_1(s) . \exists y_1 \in \mathcal{D}_1(t) . \forall y_2 \in \mathcal{D}_2(t) . \exists x_2 \in \mathcal{D}_2(s) . \bigl(E_t^{y_1,y_2}(k) \ge E_s^{x_1,x_2}(k)\bigr) . \tag{3.45}$$

Proof. The proof proceeds by induction on the computation of the fixpoint relation R. We
first present the case for $\preceq_1$. Call $R_n$ the n-th iterate of the simulation relation R, and
let $d_n$ be the n-th iterate of $[\preceq_1]$, as in Theorem 3.5. We prove by induction that, for all
states $s,t \in S$, we have $s R_n t$ iff $d_n(s,t) = 0$. We define $d_0(s,t) = [s \equiv t]$. The base case
is then immediate because $s R_0 t$ iff $d_0(s,t) = 0$. Consider the induction step, for $n \ge 0$,
and consider any states $s,t \in S$. Assume first that $d_{n+1}(s,t) > 0$: then, it is easy to show
that we can find a value for k in (3.45) that witnesses $(s,t) \notin R_{n+1}$, since the constraints
on k due to $B(R_n)$ are weaker than those due to $C(d_n)$. Conversely, assume that there is
a $k \in B(R_n)$ that witnesses $(s,t) \notin R_{n+1}$. Then, by scaling all k values so that they are
all smaller than the smallest non-zero value of $d_n(s',t')$ for any $s',t' \in S$, we can find a
$k' \in C(d_n)$ which also witnesses $d_{n+1}(s,t) > 0$, as required.

The case for $\simeq_g$ is analogous, due to the similarity of the Picard iterations (3.14) for
$\preceq_1$ and (3.15) for $\simeq_g$. ■
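
The scaling step in the proof above can be illustrated concretely. The sketch below is an assumption-laden example, not part of the construction in the paper: the distance table `d`, the valuation `k`, and the helper names `in_C`, `in_B`, `kernel`, and `scale_into_C` are hypothetical, and valuations are assumed to take values in [0, 1]. It takes a valuation that respects the kernel relation of `d` but violates the constraints of $C(d)$, and scales it by the smallest non-zero distance so that it falls inside $C(d)$.

```python
def in_C(k, d, eps=1e-12):
    """k is in C(d) iff k(s) - k(t) <= d(s,t) for all pairs (s,t)."""
    return all(k[s] - k[t] <= d[(s, t)] + eps for (s, t) in d)


def in_B(k, R):
    """k is in B(R) iff k(s) <= k(t) whenever s R t."""
    return all(k[s] <= k[t] for (s, t) in R)


def kernel(d):
    """Pairs at distance 0."""
    return {p for p, v in d.items() if v == 0}


def scale_into_C(k, d):
    """Scale a [0,1]-valued k by the smallest non-zero distance of d."""
    positive = [v for v in d.values() if v > 0]
    c = min(positive) if positive else 1.0
    return {s: c * v for s, v in k.items()}


# Hypothetical directed distance on three states; only (s,t) off the diagonal is 0.
d = {
    ("s", "s"): 0.0, ("t", "t"): 0.0, ("u", "u"): 0.0,
    ("s", "t"): 0.0, ("t", "s"): 0.4,
    ("s", "u"): 0.5, ("u", "s"): 0.2,
    ("t", "u"): 0.5, ("u", "t"): 0.2,
}
R = kernel(d)
k = {"s": 0.0, "t": 1.0, "u": 0.5}   # respects R, but k(t) - k(s) = 1 > d(t,s)
k_scaled = scale_into_C(k, d)
```

Scaling preserves the order of the values of k, so whenever the original valuation separates two states, the scaled one does too, only by a smaller margin.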

We note that the above theorem allows the computation of $\simeq_g$ via a partition-refinement
scheme. From the logical characterization theorem, we obtain the following corollary.

Corollary 3.13. For any game structure G and states s,t of G, we have $s \simeq_g t$ iff $[\![\varphi]\!](s) = [\![\varphi]\!](t)$
holds for every $\varphi \in q\mu$, and $s \preceq_1 t$ iff $[\![\varphi]\!](s) \le [\![\varphi]\!](t)$ holds for every $\varphi \in q\mu_1^+$.

3.7. Relation between Game Metrics and (Bi-)simulation Metrics. The a priori
metrics assume an adversarial relationship between the players. We show that, on turn-based
games, the a priori bisimulation metric coincides with the classical bisimulation metric
where the players cooperate.

We define such "cooperative" simulation and bisimulation metrics $[\preceq_{12}]$ and $[\simeq_{12}]$ as
the metric analog of classical (bi)simulation [22, 25]. We define the metric transformers
$H_{\preceq_{12}} : \mathcal{M} \to \mathcal{M}$ and $H_{\simeq_{12}} : \mathcal{M} \to \mathcal{M}$, for all metrics $d \in \mathcal{M}$ and $s,t \in S$, by:

$$H_{\preceq_{12}}(d)(s,t) = [s \equiv t] \sqcup \sup_{k \in C(d)} \sup_{x_1 \in \mathcal{D}_1(s)} \sup_{x_2 \in \mathcal{D}_2(s)} \inf_{y_2 \in \mathcal{D}_2(t)} \inf_{y_1 \in \mathcal{D}_1(t)} \bigl\{E_s^{x_1,x_2}(k) - E_t^{y_1,y_2}(k)\bigr\} .$$

$$H_{\simeq_{12}}(d)(s,t) = H_{\preceq_{12}}(d)(s,t) \sqcup H_{\preceq_{12}}(d)(t,s) .$$

The metrics $[\preceq_{12}]$ and $[\simeq_{12}]$ are defined as the least fixed points of $H_{\preceq_{12}}$ and $H_{\simeq_{12}}$
respectively. The kernels of these metrics define the classical probabilistic simulation and
bisimulation relations.

Theorem 3.14. The following assertions hold.

(1) On turn-based game structures, $[\simeq_g] = [\simeq_{12}]$.

(2) There is a deterministic game structure G and states s,t in G such that $[s \simeq_g t] > [s \simeq_{12} t]$.

(3) There is a deterministic game structure G and states s,t in G such that $[s \simeq_g t] < [s \simeq_{12} t]$.

Figure 2: $[s \simeq_g t] = \frac{1}{2}$ and $[s \simeq_{12} t] = 0$.

Figure 3: $[s \simeq_g t] = 0$ but $[s \simeq_{12} t] = 1$.

Proof. For the first part, since we have turn-based games, only one player has a choice of
moves at each state. We say that a state s belongs to player $i \in \{1,2\}$ if player $\sim i$ has
only one move at s. First, notice that due to the presence of the variable turn, the metric
distance between states belonging to different players is always 1, for all the metrics we
consider. Thus, we focus on the metric distances between states belonging to the same
player. Consider two player 1 states $s,t \in S$. From the definitions of $H_{\preceq_1}$ and $H_{\preceq_{12}}$, for
$d \in \mathcal{M}$, by dropping the moves of player 2, it is easy to see that $H_{\preceq_1}(d) = H_{\preceq_{12}}(d)$, and
$H_{\simeq_g}(d) = H_{\simeq_{12}}(d)$. Since this holds for any $d \in \mathcal{M}$, it holds for the fixpoints, $[\simeq_g]$ and
$[\simeq_{12}]$.

The second part is proved by the game in Figure 2, where $[s \equiv t] = 0$ and $[u \equiv v] = 1$.
The latter yields $[u \simeq_g v] = 1$. Since player 1 has no choice of moves at state s, the
maximum probability with which player 1 can guarantee a transition to either state u or
state v is 0. But from state t, by playing moves a, b with probability $\frac{1}{2}$ each, player 1 can
guarantee reaching states u and v with probability $\frac{1}{2}$, which implies that over all $k \in C(d)$,
given that $d(u,v) = 1$ from $[u \simeq_g v] = 1$, the maximum k expectation that player 1 can
guarantee is $\frac{1}{2}$. Therefore $[s \simeq_g t] = \frac{1}{2}$. But if player 2 cooperates, then $[s \simeq_{12} t] = 0$.

The third part is proved by the game in Figure 3, where again $[s \equiv t] = 0$ and $[u \equiv v] = 1$.
Since the players do not have any moves to transition to state v from state t, $[s \simeq_{12} t] = 1$,
whereas $[s \simeq_g t] = 0$. ■

If we consider Markov decision processes (MDPs), we have that on i-MDPs, the metric $[\preceq_i]$
coincides with $[\preceq_{12}]$, since player $\sim i$ has no moves, for $i \in \{1,2\}$. On the other hand, the
metric $[\preceq_{\sim i}]$ provides no information on $[\preceq_{12}]$.

Theorem 3.15. The following assertions hold.

(1) For i-MDPs we have $[\preceq_i] = [\preceq_{12}]$.

(2) There is a deterministic 2-MDP G with states s,t such that $[s \preceq_1 t] < [s \preceq_{12} t]$.

(3) There is a deterministic 2-MDP G with states s,t such that $[s \preceq_1 t] > [s \preceq_{12} t]$.

Proof. From the definitions of $H_{\preceq_i}$ and $H_{\preceq_{12}}$, restricted to MDPs, where only one player
has a choice of moves, the first assertion follows.

Figure 4: $[s \preceq_1 t] = 0$ and $[s \preceq_{12} t] = 1$. Also, $[t \preceq_1 s] = 1$ and $[t \preceq_{12} s] = 0$.

The second and third assertions are proved by the deterministic 2-MDP in Figure 4,
where again $[s \equiv t] = 0$ and $[u \equiv v] = 1$. For the second assertion we note that since
$d(u,v) = 1$, for any choice of $k \in C(d)$, player 1 cannot get a higher expectation of k from
state s when compared to state t, because at state s, player 2 always has a move that will
lead to a state yielding a lower k expectation. Therefore, $[s \preceq_1 t] = 0$. Further, for $k(v) = 1$
and $k(u) = 0$, which satisfies the constraints on k, we have no moves for either player from
state t, which implies $[s \preceq_{12} t] = 1$.

We prove the third assertion by showing that, for the 2-MDP of Figure 4, we have
$[t \preceq_1 s] > [t \preceq_{12} s]$ (which is the third assertion, with s and t exchanged). Note that when
player 2 cooperates, the expectation of any $k \in C(d)$ from state s is always at least as much
as the expectation from state t. Thus $[t \preceq_{12} s] = 0$. Finally, there exists a $k \in C(d)$, with
$k(u) = 1$ and $k(v) = 0$, for which $[t \preceq_1 s] = 1$, which completes the proof. ■

3.8. Computation. We now show that the metrics are computable to any degree of
precision. This follows since the definition of the distance between two states of a given game, as
the least fixpoint of the metric transformer (3.9), can be written as a formula in the theory
of reals, which is decidable [29]. Since the distance between two states may not be rational,
we can only guarantee an approximate computation in general.

Without loss of generality, we assume that the states of G are labeled $\{s_1, \ldots, s_n\}$
for some $n \in \mathbb{N}$. The construction is standard (see, e.g., [7]); we recapitulate the main
steps. We denote by $\mathbf{R}$ the real-closed field $(\mathbb{R}, +, \cdot, 0, 1, \le)$ of the reals with addition and
multiplication. An atomic formula is an expression of the form $p > 0$ or $p = 0$ where p is
a (possibly) multi-variate polynomial with integer coefficients. An elementary formula is
constructed from atomic formulas by the grammar

$$\varphi ::= a \mid \neg\varphi \mid \varphi \wedge \varphi \mid \varphi \vee \varphi \mid \exists x.\varphi \mid \forall x.\varphi ,$$

where a is an atomic formula, $\wedge$ denotes conjunction, $\vee$ denotes disjunction, $\neg$ denotes
complementation, and $\exists$ and $\forall$ denote existential and universal quantification respectively.
We write $\varphi \to \varphi'$ as shorthand for $\neg\varphi \vee \varphi'$. The semantics of elementary formulas are given
in a standard way [1]. A variable x is free in the formula $\varphi$ if it is not in the scope of a
quantifier $\exists x$ or $\forall x$. An elementary sentence is a formula with no free variables. The theory
of real-closed fields is decidable [29].

We introduce additional atomic formulas as syntactic sugar: for polynomials $p_1$ and
$p_2$, we write $p_1 = p_2$ for $p_1 - p_2 = 0$, $p_1 > p_2$ for $p_1 - p_2 > 0$, and $p_1 \ge p_2$ for
$p_1 - p_2 = 0 \vee p_1 - p_2 > 0$. Also, we write $p_1 < p_2$ for $p_2 > p_1$ and $p_1 \le p_2$ for $p_2 \ge p_1$. Let x, y denote
vectors of variables, where the dimensions of the vectors will be clear from the context. For
$\sim \in \{=, \le, \ge\}$, we write $x \sim y$ for the pointwise ordering, that is, if $\bigwedge_i x_i \sim y_i$. A subset
$C \subseteq \mathbb{R}^m$ is definable in $\mathbf{R}$ if there exists an elementary formula $\varphi_C(x)$ such that for any
$x_0 \in \mathbb{R}^m$, we have $\varphi_C(x_0)$ holds in $\mathbf{R}$ iff $x_0 \in C$. A function $f : \mathbb{R}^k \to \mathbb{R}^m$ is definable in
$\mathbf{R}$ if there exists an elementary formula $\varphi_f(y,x)$ with free variables y, x such that for all
constants $y_0 \in \mathbb{R}^m$ and $x_0 \in \mathbb{R}^k$ the formula $\varphi_f(y_0,x_0)$ is true in $\mathbf{R}$ iff $y_0 = f(x_0)$. We
start with some simple observations about definability.

Lemma 3.16. (a) If functions $f_1 : \mathbb{R}^k \to \mathbb{R}^m$ and $f_2 : \mathbb{R}^k \to \mathbb{R}^m$ are definable in $\mathbf{R}$ then
so are the functions

$$(f_1 - f_2)(x) = f_1(x) - f_2(x)$$
$$(f_1 \sqcup f_2)(x) = f_1(x) \sqcup f_2(x)$$

(b) If $f : \mathbb{R}^{k+l} \to \mathbb{R}^m$ is definable in $\mathbf{R}$, and $C \subseteq \mathbb{R}^k$ is definable in $\mathbf{R}$, then $(\sup_C f) : \mathbb{R}^l \to \mathbb{R}^m$
defined as

$$(\sup_C f)(y) = \sup_{x \in C} f(x,y)$$

is definable in $\mathbf{R}$.

Proof. For part (a), let $\varphi_1(y,x)$ and $\varphi_2(y,x)$ be formulas defining $f_1$ and $f_2$ respectively.
Then, $f_1 - f_2$ is defined by the formula

$$\exists z_1 . \exists z_2 . \bigl(\varphi_1(z_1,x) \wedge \varphi_2(z_2,x) \wedge y = z_1 - z_2\bigr) ,$$

and $f_1 \sqcup f_2$ is defined by the formula

$$\exists z_1 . \exists z_2 . \Bigl(\varphi_1(z_1,x) \wedge \varphi_2(z_2,x) \wedge \bigwedge_i \bigl[(z_{1,i} \ge z_{2,i} \wedge y_i = z_{1,i}) \vee (z_{1,i} < z_{2,i} \wedge y_i = z_{2,i})\bigr]\Bigr) .$$

For part (b), let $\varphi_f(z,x,y)$ define f, where x is of dimension k, y of dimension l, and z
of dimension m, respectively. Let $\varphi_C(x)$ define C. Then, the following formula with free
variables z, y (call it $\psi(z,y)$) states that z is an upper bound of $f(x,y)$ for all $x \in C$:

$$\forall x_1 . \forall z_1 . \bigl(\varphi_C(x_1) \wedge \varphi_f(z_1,x_1,y) \to z_1 \le z\bigr) ,$$

and $\sup_C f$ is defined by the formula with free variables z, y given by:

$$\psi(z,y) \wedge \forall z_1 . \bigl(\psi(z_1,y) \to z \le z_1\bigr) .$$

■

Theorem 3.17. Let G be a game structure and s,t states of G. For all rationals v, and
all $\epsilon > 0$, it is decidable if $|[s \preceq_1 t] - v| < \epsilon$ and if $|[s \simeq_g t] - v| < \epsilon$. It is decidable if $s \preceq_1 t$
and if $s \simeq_g t$.

Proof. First, we use a result of Weyl [33] that the minmax value of a matrix game with
payoffs in $\mathbb{R}$ can be written as an elementary formula in the theory of real-closed fields.
This implies that for any state s, the function $\mathrm{Pre}_1(k)(s)$ is definable in $\mathbf{R}$. Also, for
$d \in \mathcal{M}$, the set $C(d)$ is definable in $\mathbf{R}$ (since conjunctions of linear constraints are definable
in $\mathbf{R}$). Hence, by Lemma 3.16(a) and (b), we have that $\sup_{k \in C(d)} (\mathrm{Pre}_1(k)(s) - \mathrm{Pre}_1(k)(t))$
is definable for any metric $d \in \mathcal{M}$, and states s and t of G. By another application of
Lemma 3.16(a), we have that the function

$$H_{\preceq_1}(d)(s,t) = [s \equiv t] \sqcup \sup_{k \in C(d)} \bigl(\mathrm{Pre}_1(k)(s) - \mathrm{Pre}_1(k)(t)\bigr)$$

is definable for $d \in \mathcal{M}$ and states s and t of G.

Consider the set of free variables $\{ y(s,t), d(s,t) \mid s, t \in S \}$, where $d$ is a vector of $n^2$
free variables defining the metric $d$, and where $y$ is a vector of $n^2$ variables. Let $\varphi(y, d)$ be
a formula in R, with free variables in the above set, such that $\varphi(y, d)$ is true iff $y(s, t) = H_{\preceq_1}(d)(s, t)$
holds for all $s, t \in S$. Then the formula $\varphi^*(y)$ with free variables $y$, defined as:

$$\exists d . \bigl( \varphi(y, d) \wedge y = d \bigr),$$

defines a fixpoint of $H_{\preceq_1}$. Finally, the formula $\psi(y)$, given by

$$\varphi^*(y) \wedge \forall y' . \bigl( \varphi^*(y') \rightarrow y \le y' \bigr),$$

defines the least fixpoint of $H_{\preceq_1}$ (again, $y' = \{ y'(s,t) \mid s, t \in S \}$ is a matrix of $n^2$ variables,
and $y \le y'$ iff $y(s,t) \le y'(s,t)$ for all $s, t \in S$). Thus, $\psi(y)$ is true iff $y(s,t) = [s \preceq_1 t]$ for all
$s, t \in S$.

While this shows that $[s \preceq_1 t]$ is algebraic, there are game structures $G$ with all transition
probabilities being rational, but with states $s$ and $t$ of $G$ such that $[s \preceq_1 t]$ is irrational.
So, we use the formula above to approximate the value of $[s \preceq_1 t]$ to within a constant $\varepsilon$.
For states $s, t$ and rationals $v, \varepsilon$, we have that $|[s \preceq_1 t] - v| < \varepsilon$ iff $\exists y . \bigl( \psi(y) \wedge |y(s, t) - v| < \varepsilon \bigr)$
is valid, and this can be decided since R is decidable.
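The approximation step can be pictured with a minimal Python sketch. It assumes that the distance lies in $[0, 1]$ and that a decision procedure for the sentence above is available as a black box; the names `approximate`, `Oracle`, and `sqrt_half_oracle` are hypothetical and do not come from the paper. The demonstration oracle stands in for the decision procedure by deciding $|x - v| < \varepsilon$ exactly for the irrational algebraic number $x = \sqrt{1/2}$, using only rational arithmetic.

```python
from fractions import Fraction
from typing import Callable

# Hypothetical oracle type: within(v, eps) decides |x - v| < eps for a fixed,
# unknown real x. In the proof, this role is played by a decision procedure
# for the sentence  exists y. (psi(y) and |y(s,t) - v| < eps).
Oracle = Callable[[Fraction, Fraction], bool]


def approximate(within: Oracle, eps: Fraction) -> Fraction:
    """Return a rational v with |x - v| < eps, assuming x lies in [0, 1]."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    v = Fraction(0)
    # Scan the grid 0, eps, 2*eps, ...; the largest grid point not above x
    # is at distance strictly less than eps from x, so the scan succeeds.
    while v <= 1:
        if within(v, eps):
            return v
        v += eps
    raise RuntimeError("oracle is inconsistent with a value in [0, 1]")


def sqrt_half_oracle(v: Fraction, eps: Fraction) -> bool:
    """Decide |sqrt(1/2) - v| < eps exactly, by comparing squares."""
    lo, hi = v - eps, v + eps
    above_lo = lo < 0 or lo * lo < Fraction(1, 2)   # lo < sqrt(1/2)
    below_hi = hi > 0 and hi * hi > Fraction(1, 2)  # sqrt(1/2) < hi
    return above_lo and below_hi


if __name__ == "__main__":
    print(approximate(sqrt_half_oracle, Fraction(1, 100)))
```

The scan makes up to about $1/\varepsilon$ oracle calls; a bisection over intervals would use fewer calls, but the linear scan keeps the correspondence with the statement of the theorem direct.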

A similar construction shows that the question whether $|[s \simeq_g t] - v| < \varepsilon$ is decidable
for states $s, t$ and rationals $v, \varepsilon$: we ensure that $y$ is a symmetric fixpoint by conjoining to
$\varphi^*(y)$ the constraints $y(s,t) = y(t,s)$ for all states $s, t$.

If the formula $\exists y . \bigl( \psi(y) \wedge y(s, t) = 0 \bigr)$, which asserts that the distance between $s$ and
$t$ is zero, is valid, we can conclude that $s \preceq_1 t$. This implies that the relation $s \preceq_1 t$ is
decidable for any game structure $G$ and states $s$ and $t$ of $G$. A similar construction for $\simeq_g$
shows that the relation $s \simeq_g t$ is also decidable for any game structure $G$ and states $s, t$ of
$G$. ∎

4. Discussion 

Our derivation of $\preceq_i$ and $\simeq_g$, for $i \in \{1, 2\}$, as kernels of metrics, seems somewhat
abstruse: most equivalence or similarity relations have been defined, after all, without
resorting to metrics. We now point out how a generalization of the usual definitions [25, 2,
9, 10], suggested in [6, 19], fails to produce the "right" relations. Furthermore, the flawed
relations obtained as a generalization of [25, 2, 9, 10] are no simpler than our definitions,
based on kernel metrics. Thus, studying game relations as kernels of metrics does not
lead to more complicated definitions. Indeed, we believe that the
metric approach is the superior one for the study of game relations.

We outline the flawed generalization of [25, 2, 9, 10] as proposed in [6, 19], explaining
why it would seem a natural generalization. The alternating simulation of [2] is defined over
deterministic game structures. Player-$i$ alternating simulation, for $i \in \{1, 2\}$, is the largest
relation $R$ satisfying the following conditions, for all states $s, t \in S$: $s \mathrel{R} t$ implies $s \equiv t$ and

$$\forall x_i \in \Gamma_i(s) \,.\, \exists y_i \in \Gamma_i(t) \,.\, \forall y_{\sim i} \in \Gamma_{\sim i}(t) \,.\, \exists x_{\sim i} \in \Gamma_{\sim i}(s) \,.\, \tau(s, x_1, x_2) \mathrel{R} \tau(t, y_1, y_2).$$
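The definition above can be computed as a greatest fixpoint by starting from all propositionally equivalent pairs and discarding pairs that violate the quantifier condition. The following Python sketch does this for player 1 on a deterministic game structure. The function names, the dictionary-based encoding, and the example game (modeled on the penny-matching game used later in the proof of Theorem 4.4, with sink states given a single self-looping move) are assumptions made for illustration, not the paper's own algorithm.

```python
from itertools import product


def alternating_simulation(states, label, moves1, moves2, tau):
    """Largest R such that s R t implies label[s] == label[t] and
    forall x1 in moves1[s]. exists y1 in moves1[t].
    forall y2 in moves2[t]. exists x2 in moves2[s].
    tau[s, x1, x2] R tau[t, y1, y2]   (player-1 version, pure moves only).
    """
    rel = {(s, t) for s, t in product(states, repeat=2) if label[s] == label[t]}

    def ok(s, t):
        return all(
            any(
                all(
                    any((tau[s, x1, x2], tau[t, y1, y2]) in rel
                        for x2 in moves2[s])
                    for y2 in moves2[t]
                )
                for y1 in moves1[t]
            )
            for x1 in moves1[s]
        )

    # Remove violating pairs until the relation is stable.
    changed = True
    while changed:
        bad = {(s, t) for (s, t) in rel if not ok(s, t)}
        changed = bool(bad)
        rel -= bad
    return rel


def penny_game():
    """Hypothetical encoding: 2-sided pennies at s, 3-sided pennies at t."""
    states = ["s", "t", "u", "v"]
    # Only equality of labels matters: s and t agree, u and v differ.
    label = {"s": "start", "t": "start", "u": "match", "v": "mismatch"}
    sides = {"s": "ab", "t": "abc", "u": "a", "v": "a"}
    moves1 = {q: list(m) for q, m in sides.items()}
    moves2 = {q: list(m) for q, m in sides.items()}
    tau = {}
    for q in ("s", "t"):
        for x1, x2 in product(sides[q], repeat=2):
            tau[q, x1, x2] = "u" if x1 == x2 else "v"
    tau["u", "a", "a"] = "u"  # assumed absorbing sinks
    tau["v", "a", "a"] = "v"
    return states, label, moves1, moves2, tau


if __name__ == "__main__":
    rel = alternating_simulation(*penny_game())
    print(("s", "t") in rel, ("t", "s") in rel)
```

On this encoding the pairs (s, t) and (t, s) both survive, so with pure moves the two penny-matching states cannot be told apart, even though their values under mixed moves differ.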

The MDP relations of [25], later extended to metrics by [9, 10], rely on the fixpoint
(3.2), where sup plays the role of $\forall$, inf plays the role of $\exists$, and $R$ is replaced by distribution
equality modulo $R$, or $\sqsubseteq_R$. This strongly suggests, incorrectly, that equivalences
for general games (probabilistic, concurrent games) can be obtained by taking the double
quantifier alternation $\forall\exists\forall\exists$ in the definition of alternating simulation, changing all $\forall$ into
sup, all $\exists$ into inf, and replacing $R$ by $\sqsubseteq_R$. The definition that would result is as follows.
We parametrize the new relations by a player $i \in \{1, 2\}$, as well as by whether mixed moves
or only pure moves are allowed. For a relation $R \subseteq S \times S$, for $M \in \{\Gamma, \mathcal{D}\}$, for all $s, t \in S$
and $i \in \{1, 2\}$ consider the following conditions:

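• (loc) $s \mathrel{R} t$ implies $s \equiv t$.

• ($M$-$i$-altsim) $s \mathrel{R} t$ implies

$$\forall x_i \in M_i(s) \,.\, \exists y_i \in M_i(t) \,.\, \forall y_{\sim i} \in M_{\sim i}(t) \,.\, \exists x_{\sim i} \in M_{\sim i}(s) \,.\, \delta(s, x_1, x_2) \sqsubseteq_R \delta(t, y_1, y_2).$$

We then define the following relations:

• For $i \in \{1, 2\}$ and $M \in \{\Gamma, \mathcal{D}\}$, player-$i$ $M$-alternating simulation $\sqsubseteq^M_i$ is the largest
relation that satisfies (loc) and ($M$-$i$-altsim).

• For $i \in \{1, 2\}$ and $M \in \{\Gamma, \mathcal{D}\}$, player-$i$ $M$-alternating bisimulation $\cong^M_i$ is the largest
symmetrical relation that satisfies (loc) and ($M$-$i$-altsim).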

Over deterministic game structures, the definitions of $\sqsubseteq^\Gamma_i$ and $\cong^\Gamma_i$ coincide with the alternating
simulation and bisimulation relations of [2]. In fact, $\sqsubseteq^\Gamma_i$ and $\cong^\Gamma_i$ capture the deterministic
semantics of $q\mu$, and thus in some sense generalize the results of [2] to probabilistic game
structures.

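Theorem 4.1. For any game structure $G$ and states $s, t$ of $G$, the following assertions hold:

(1) $s \cong^\Gamma_i t$ iff $[\![\varphi]\!]^\Gamma(s) = [\![\varphi]\!]^\Gamma(t)$ holds for every $\varphi \in q\mu_i$.

(2) $s \sqsubseteq^\Gamma_i t$ iff $[\![\varphi]\!]^\Gamma(s) \le [\![\varphi]\!]^\Gamma(t)$ holds for every $\varphi \in q\mu_i^+$.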

The following lemma states that $\sqsubseteq^{\mathcal{D}}_i$ and $\cong^{\mathcal{D}}_i$ are the kernels of $[\sqsubseteq_i]$ and $[\cong_i]$, thus
connecting the result of combining the definitions of [25] and [2] with the a posteriori metrics.

Lemma 4.2. For all game structures $G$, all players $i \in \{1, 2\}$, and all states $s, t$ of $G$, we
have $s \sqsubseteq^{\mathcal{D}}_i t$ iff $[s \sqsubseteq_i t] = 0$, and $s \cong^{\mathcal{D}}_i t$ iff $[s \cong_i t] = 0$.

We are now in a position to prove that neither the $\Gamma$-relations nor the $\mathcal{D}$-relations
are the "canonical" relations on general concurrent games, since neither characterizes $[\![q\mu]\!]$.
In particular, the $\mathcal{D}$-relations are too fine, and the $\Gamma$-relations are incomparable with the
relations $\preceq_i$ and $\simeq_g$, for $i \in \{1, 2\}$. We prove these negative results first for the $\mathcal{D}$-relations.
They follow from Theorems 3.7 and 3.11.

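Theorem 4.3. The following assertions hold:

(1) For all game structures $G$, all states $s, t$ of $G$, and all $i \in \{1, 2\}$, we have that $s \sqsubseteq^{\mathcal{D}}_i t$
implies $s \preceq_i t$, and $s \cong^{\mathcal{D}}_i t$ implies $s \simeq_g t$.

(2) There is a game structure $G$, and states $s, t$ of $G$, such that $s \preceq_1 t$ but $s \not\sqsubseteq^{\mathcal{D}}_1 t$.

(3) There is a game structure $G$, and states $s, t$ of $G$, such that $[\![\varphi]\!](s) = [\![\varphi]\!](t)$ for all
$\varphi \in q\mu$, but $s \not\cong^{\mathcal{D}}_i t$ for some $i \in \{1, 2\}$.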

We now turn our attention to the $\Gamma$-relations, showing that they are incomparable with
$\preceq_i$ and $\simeq_g$, for $i \in \{1, 2\}$.

Theorem 4.4. The following assertions hold:

(1) There exists a deterministic game structure $G$ and states $s, t$ of $G$ such that $s \sqsubseteq^\Gamma_1 t$ but
$s \not\preceq_1 t$, and $s \cong^\Gamma_1 t$ but $s \not\simeq_g t$.

(2) There exists a turn-based game structure $G$ and states $s, t$ of $G$ such that $s \preceq_1 t$ but
$s \not\sqsubseteq^\Gamma_1 t$, and $s \simeq_g t$ but $s \not\cong^\Gamma_1 t$.

Proof. The first assertion is proved via the deterministic game in Figure 5, where $[s \equiv t] = 0$
and $[u \equiv v] = 1$ and $\Gamma_1(s) = \Gamma_2(s) = \{a, b\}$ and $\Gamma_1(t) = \Gamma_2(t) = \{a, b, c\}$. In the figure, we
use the variables $x$ and $y$ to represent moves: if the moves of player 1 and player 2 coincide, $u$ is
the successor state, otherwise it is $v$. Thus, the game from $s$ is the usual "penny-matching"
game; the game from $t$ is a version of "penny-matching" with 3-sided pennies.

Figure 5: $s \sqsubseteq^\Gamma_1 t$ but $s \not\preceq_1 t$, and $s \cong^\Gamma_1 t$ but $s \not\simeq_g t$.

Figure 6: $s \preceq_1 t$ but $s \not\sqsubseteq^\Gamma_1 t$, and $s \simeq_g t$ but $s \not\cong^\Gamma_1 t$.

It can be seen that $s \cong^\Gamma_1 t$. On the other hand, we have $s \not\preceq_1 t$. Indeed, from state
$s$, by playing both $a$ and $b$ with probability $\frac{1}{2}$, player 1 can ensure that the probability
of a transition to $u$ is $\frac{1}{2}$. On the other hand, from state $t$, player 1 can achieve at most
probability $\frac{1}{3}$ of reaching $u$ (this maximal probability is achieved by playing all of $a$, $b$, $c$
with probability $\frac{1}{3}$). The result then follows using Theorem 3.11.
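The two one-step values in this argument can be checked with exact rational arithmetic. The sketch below, whose function names are hypothetical, treats a transition to $u$ as payoff 1 and a transition to $v$ as payoff 0, and certifies the value of the $n$-sided matching game by exhibiting a mixed move for each player whose guarantees coincide. This is a certificate check for this particular family of games, not a general matrix-game solver.

```python
from fractions import Fraction


def guaranteed_by_row_mix(payoff, x):
    """Payoff that player 1 secures with mixed move x (worst case over columns)."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    return min(
        sum(x[i] * payoff[i][j] for i in range(n_rows)) for j in range(n_cols)
    )


def cap_by_col_mix(payoff, y):
    """Payoff that player 2 holds player 1 to with mixed move y (best row)."""
    return max(sum(row[j] * y[j] for j in range(len(row))) for row in payoff)


def matching_value(n):
    """Value of the n-sided matching game: payoff 1 iff both moves coincide."""
    payoff = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    uniform = [Fraction(1, n)] * n
    low = guaranteed_by_row_mix(payoff, uniform)   # lower bound on the value
    high = cap_by_col_mix(payoff, uniform)         # upper bound on the value
    if low != high:
        raise ValueError("uniform moves do not certify the value")
    return low


if __name__ == "__main__":
    print(matching_value(2), matching_value(3))
```

Because the lower and upper bounds coincide, the uniform moves are optimal for both players, which gives $\frac{1}{2}$ at $s$ and $\frac{1}{3}$ at $t$.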

The second assertion is proved via the game in Figure 6. We have $s \not\sqsubseteq^\Gamma_1 t$: clearly,
player 1's move $c$ at state $s$ cannot be mimicked at $t$ when the game is restricted to pure
moves. On the other hand, we have $s \simeq_g t$: since the move $c$ at $s$ can be imitated via the
mixed move that plays both $a$ and $b$ at $t$ with probability $\frac{1}{2}$ each, all $q\mu$ formulas have the
same value, under $[\![\cdot]\!]$, at $s$ and $t$, and the result follows once more using Theorem 3.11.

∎

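Finally, we remark that, in view of Theorem 3.12, the definitions of the relations $\preceq_i$
and $\simeq_g$ for $i \in \{1, 2\}$ are no more complex than the definitions of the alternating relations $\sqsubseteq^M_i$ and $\cong^M_i$, for $M \in \{\Gamma, \mathcal{D}\}$.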

5. Conclusions 

We have introduced the metrics and relations that constitute the natural generalizations 
of simulation and bisimulation to stochastic games on graphs. These relations and metrics 
are tight, in the sense that the distance between two states is equal to the maximum 
difference in value that properties of the quantitative $\mu$-calculus can assume at the two
states: in other words, the relations characterize the quantitative $\mu$-calculus, in the same way
in which ordinary bisimulation characterizes the $\mu$-calculus. The paper also provided a full
picture of the connection between the new metrics and relations, and the relations previously
considered for games.

The main point left open by the paper concerns the algorithms for the computation of
the relations and metrics. The algorithms we provided rely on the decidability of the theory
of reals; it is an open question whether more efficient, and more direct, algorithms exist,
for the metrics or at least for the relations.